Questions tagged [dbplyr]
dbplyr: A 'dplyr' Back End for Databases
402
questions
55
votes
0
answers
2k
views
How might I get detailed database error messages from dplyr::tbl?
I'm using R to plot some data I pull out of a database (the Stack Exchange data dump, to be specific):
dplyr::tbl(serverfault,
dbplyr::sql("
select year(p.CreationDate) year,
...
14
votes
3
answers
6k
views
How to join tables from different SQL databases using R and dplyr?
I'm using dplyr (0.7.0), dbplyr (1.0.0), DBI 0.6-1, and odbc (1.0.1.9000). I would like to do something like the following:
db1 <- DBI::dbConnect(
odbc::odbc(),
Driver = "SQL Server",
Server ...
14
votes
4
answers
1k
views
collect only if query returns less than n_max rows
Occasionally, when connecting to my Oracle database through ROracle and dbplyr, I run a dplyr::collect operation that fetches more data than expected, and more than R can handle.
This can make R crash ...
13
votes
1
answer
7k
views
Avoiding warning message “There is a result object still in use” when using dbSendQuery to create table on database
Background:
I use dbplyr and dplyr to extract data from a database, then I use the command dbSendQuery() to build my table.
Issue:
After the table is built, if I run another command I get the ...
13
votes
0
answers
12k
views
dbplyr - Error: x and y don't share the same src. Set copy = TRUE to copy y into x's source (this may be time consuming)
Normally we have no trouble using the connection method below to run queries against Redshift:
require("RPostgreSQL")
drv <- dbDriver("PostgreSQL")
conn <- dbConnect(drv, dbname = "...
12
votes
3
answers
13k
views
Error: ! Failed to collect lazy table. Caused by error in `db_collect()` - using biomaRt package in R
I'm currently working on a bioinformatics project using R, and I'm encountering an error when trying to use the biomaRt package. After installing the package and loading it into R, I tried to select a ...
12
votes
2
answers
7k
views
Create the SQL query "SELECT * FROM myTable LIMIT 10" using dplyr
Suppose I have a connection to an external database called con.
I would like to use dplyr to reproduce this query
SELECT var1, var2, var3 from myTable LIMIT 10
I have tried
qry <- tbl(con, "...
12
votes
3
answers
2k
views
Non-equi join in tidyverse
I was wondering whether someone knows if the dplyr extension packages (dbplyr and dtplyr) allow non-equi joins within the usual dplyr workflow? I rarely need data.table, but fast non-equi joins are ...
11
votes
1
answer
4k
views
Mutate variables in database tables directly using dplyr
Here is mtcars data in the MonetDBLite database file.
library(MonetDBLite)
library(tidyverse)
library(DBI)
dbdir <- getwd()
con <- dbConnect(MonetDBLite::MonetDBLite(), dbdir)
dbWriteTable(...
11
votes
1
answer
380
views
How to escape Athena database.table using pool package?
I'm trying to connect to Amazon Athena via JDBC and pool:
What has worked so far:
library(RJDBC)
library(DBI)
library(pool)
library(dplyr)
library(dbplyr)
drv <- RJDBC::JDBC('com.amazonaws....
10
votes
2
answers
3k
views
How to pipe SQL into R's dplyr?
I can use the following code in R to select distinct rows in any generic SQL database. I'd use dplyr::distinct() but it's not supported in SQL syntax. Anyways, this does indeed work:
dbGetQuery(...
10
votes
1
answer
3k
views
How to use dplyr tbl on a SQL Server non-standard schema table
My question is how can I use dplyr functions, such as tbl, on SQL Server tables that do not use the default "dbo" schema?
For more context, I am trying to apply the R database example given here to ...
9
votes
1
answer
2k
views
dbplyr, dplyr, and functions with no SQL equivalents [eg `slice()`]
library(tidyverse)
con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")
copy_to(con, mtcars)
mtcars2 <- tbl(con, "mtcars")
I can create this mock SQL database above. And it's very cool that I ...
9
votes
1
answer
1k
views
Update a table using subquery in SQLite
I want to add a column to my table using ALTER TABLE and UPDATE statements, rather than recreating the full table.
When using a subquery in my UPDATE statement I don't get the output I expect.
build ...
9
votes
1
answer
233
views
Does sql_variant in dbplyr work as it should?
Let's take a look at the example in ?sql_variant:
We define a new translator function for aggregated functions, expanded from the default one:
postgres_agg <- sql_translator(.parent = base_agg,
...
8
votes
1
answer
5k
views
dbplyr mutate character to date format in temp table
I have extracted data to a temporary table in SQL Server using DBI::dbGetQuery.
Even though, in the real query (not the play query below), I
select convert(date, date_value) as date_value, the ...
7
votes
3
answers
4k
views
How to filter by a string containing variables in dbplyr [duplicate]
I normally use filter with grepl in dplyr, but when using dbplyr I get an error that grepl is not a recognized function. My guess is that it can't be translated to SQL Server. What is a way around this ...
7
votes
2
answers
1k
views
R and dplyr: How can I use compute() to create a persistent table from SQL query in a different schema than the source schema?
I have a question similar to this Stack Overflow post.
How can I create a persistent table from a SQL query in a database (I use a DB2 database)? My goal is to use a table from one schema and to ...
7
votes
1
answer
4k
views
Adding column to sqlite database
I am trying to add a vector which I generated in R to a sqlite table as a new column. For this I wanted to use dplyr (I installed the most recent dev. version along with the dbplyr package according ...
7
votes
1
answer
5k
views
Connecting to Microsoft SQL Server with R (view is in a database in Microsoft SQL Server Management Studio (SSMS))
I have read access to some "Views" (tables) in Microsoft SQL Server Management Studio (SSMS). I connect, run my query, export a file as CSV, and then read it into R. Now I would like to make my ...
7
votes
2
answers
919
views
How to select a nested field with bigrquery using dplyr syntax?
I'd like to explore Google Analytics 360 data with bigrquery using dplyr syntax (rather than SQL), if possible. The gist is that I want to understand user journeys—I'm interested in finding the most ...
7
votes
1
answer
2k
views
R: Best Practices - dplyr and odbc multi table actions (retrieved from SQL)
Say you have your tables stored in a SQL Server DB, and you want to perform multi-table actions, i.e. join several tables from that same database.
The following code can interact with and receive data from ...
7
votes
1
answer
3k
views
Can't access string methods in dbplyr
I am trying to use the str_detect, str_replace, and str_replace_all methods in dbplyr with Oracle as the backend database, but I can't seem to access these methods.
Here is the error:
db_tbl %>% mutate(...
7
votes
1
answer
877
views
dbplyr::in_schema case sensitive
The function dbplyr::in_schema() cannot connect to tables with uppercase letters in their names.
For example, when I create a table in PostgreSQL:
CREATE TABLE public."OCLOC"
(
cod_ocloc double precision NOT NULL,
...
7
votes
2
answers
592
views
How to create custom SQL functions with R code in dbplyr?
I am using dbplyr to query an MSSQL database, and frequently round dates to the first of the month using mutate(YM = DATEFROMPARTS(YEAR(Date), MONTH(Date), 1)). I would like to be able to create an R ...
6
votes
2
answers
445
views
How do I "flush" data to my RSQLite disk database?
I'm creating a database using R package dbplyr, using RSQLite, but my database is zero-bytes in size on disk despite my writing (and reading back) a table. Here is my script:
library("RSQLite")
...
6
votes
3
answers
2k
views
How to use custom SQL function in dbplyr?
I would like to calculate the Jaro-Winkler string distance in a database. If I bring the data into R (with collect) I can easily use the stringdist function from the stringdist package.
But my data ...
6
votes
1
answer
493
views
Generate CROSS JOIN queries with dbplyr
Given 2 remote tables (simulated with tbl_lazy for this example)
library("dplyr")
library("dbplyr")
t1 <- tbl_lazy(df = iris, src = dbplyr::simulate_mysql())
t2 <- tbl_lazy(df = mtcars, src = ...
6
votes
1
answer
617
views
How to generate SQL from dbplyr without a database connection?
I currently have access to an Apache Hive database via the beeline CLI. We are still negotiating with IT to get R on the server. Until that time, I would like to (ab)use the R dbplyr package to ...
6
votes
2
answers
670
views
Dropping rows containing NA with dbplyr
Here is how I ran some SQL queries with dbplyr:
library(tidyverse)
library(dbplyr)
library(DBI)
library(RPostgres)
library(bit64)
library(tidyr)
drv <- dbDriver('Postgres')
con <- dbConnect(drv,...
6
votes
1
answer
4k
views
left_join for tbl: na_matches not working
left_join works as expected with NA values on tibbles or data frames, but on tbl it seems it does not match NAs, even with the option na_matches = "na".
R version and package versions
> ...
6
votes
1
answer
2k
views
Connecting to a database in R using an Office Data Connection (.odc) file from Power BI
I have been asked to make a bunch of charts for a large organization and have been given access to their Power BI dashboard. I want to go around Power BI's interface so I can make the charts in R. ...
6
votes
2
answers
288
views
Function that composes functions with existing sql translations in dbplyr
This question arises because I wish to make a function for my convenience:
as.numeric_psql <- function(x) {
return(as.numeric(as.integer(x)))
}
to convert boolean values in a remote postgres ...
5
votes
2
answers
2k
views
How to get field names from a tbl_dbi?
Is there a way to directly get the field names from a tbl_dbi object (db_mtcars below)?
library(RSQLite)
library(dbplyr)
library(dplyr)
con <- dbConnect(RSQLite::SQLite(), ":memory:")
...
5
votes
1
answer
771
views
Database calculations with dbplyr
I have a very simple problem that produces an error. An example will make it clear.
library(odbc)
library(DBI)
library(dplyr)
library(dbplyr)
con <- dbConnect(odbc(), "myDSN")
tbl_test <- tibble(ID =...
5
votes
1
answer
2k
views
Sparklyr using case_when with variables
Sparklyr fails when using a case_when with external variables.
Working Example:
test <- copy_to(sc, tibble(column = c(1,2,3,4)))
test %>%
mutate(group = case_when(
column %...
5
votes
1
answer
427
views
Using string matching like grepl in a dbplyr pipeline
dbplyr is very handy as it converts dplyr code into SQL. This works really well except when it doesn't. For example, I am trying to subset rows by partially matching a string against values in a column. ...
5
votes
2
answers
385
views
How to escape characters in SQL code in an R Markdown chunk?
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
library(tidyverse)
library(odbc)
library(DBI)
library(dbplyr)
```
```{sql, connection=con, output.var="df"}
SELECT DB_Fruit.Pear, Store....
5
votes
2
answers
451
views
dplyr Filter Database Table with Large Number of Matches
I am working with dplyr and the dbplyr package to interface with my database. I have a table with millions of records. I also have a list of values that correspond to the key in that same table I ...
5
votes
1
answer
3k
views
Error: The dbplyr package is required to communicate with database backends
I am new to R and learning R from sources.
I am trying to use the dplyr package for connecting to the database.
I am trying out the following tutorial, and getting this error
https://github.com/...
5
votes
1
answer
1k
views
Non-Latin characters show as question marks when using rodbc/odbc/dbplyr with SQL-Server
I'm using dbplyr to get data from SQL Server into R, but Chinese, Japanese and other non-Latin characters are appearing as "?". I'm using a Windows machine.
I've read through the following threads:
...
5
votes
2
answers
2k
views
Combining dbplyr and case_when in SQL Server
I am using dbplyr to write and run queries in SQL Server, and want to apply a conditional mutate. This can be done using ifelse or using case_when. The query works when using ifelse but throws an ...
5
votes
1
answer
771
views
R: dbplyr: postgres: How to create an index on a table
A user has a large table (3+ billion rows).
To speed up queries for the next few months, an index on the remote database must be created.
Assuming there is a connection called conn - what is the ...
4
votes
2
answers
667
views
Convert dplyr pipeline into SQL string
I would like to convert a (short) dplyr pipeline into a string representation of its equivalent SQL. For example:
library(dplyr)
dbplyr::lazy_frame() %>% filter(foo == 'bar')
will print out ...
4
votes
3
answers
280
views
How to use `last()` when mutating by group with {dbplyr}?
Consider the following remote table:
library(dbplyr)
library(dplyr, w = F)
remote_data <- memdb_frame(
grp = c(2, 2, 2, 1, 3, 1, 1),
win = c("B", "C", "A", "B"...
4
votes
2
answers
1k
views
Use variable with regex in stringr::str_detect in dbplyr SQL query
I would like to filter a SQL database based on whether a regular expression appears within any column. I would like to specify the regex as a variable; however, it is read as a literal string. I am having ...
4
votes
2
answers
1k
views
Saving a flat file as an SQL database in R without loading it 100% into RAM
I hope that what I am about to write makes some sense.
If you look at
How to deal with a 50GB large csv file in r language?
it is explained how to query a CSV file from R, à la SQL.
In my case, I have ...
4
votes
2
answers
719
views
R: Update a mysql table with data frame
I have a MariaDB database and I want to update a table with a local R data frame. As an example, I have a table with these column names:
id, foo, bar
id is the primary key of the database table.
Is there a ...
4
votes
3
answers
383
views
Store new permanent table in schema using compute
I want to use dbplyr syntax to do some JOIN / FILTER operations on some tables and store the results back in the database without collecting them first.
From what I read compute(..., temporary = FALSE, ...
4
votes
1
answer
709
views
Add postgres time interval using dplyr/dbplyr
I have a database connection in R and would like to implement the following filtering step---in Postgres---using dplyr (v0.5):
WHERE time1 < time2 - INTERVAL '30 minutes'
(see https://www....