Click here to download the books.sqlite file. We will use this with the code below. Note that a sqlite file can also have different extensions, such as .db (or anything else). There is no official standard for the extension.
23.2 Going beyond sqldf
###############################################################################.## "going beyond sqldf" - Connecting to databases through R - ## In this section we will go beyond using sqldf to manipulate dataframes in R.# We will learn about how to interact with an actual database management system# from within R.################################################################################.###############################################################################.## Resources for learning more about working with databases in R## Best Practices in Working with Databases (in RStudio)# https://solutions.posit.co/connections/db/## Datacamp Tutorial# https://www.datacamp.com/community/tutorials/sqlite-in-r## vignette("spec", package="DBI")## help(package="DBI")## Functions we will use:# dbConnect# dbExecute # create table, insert , update, delete, etc.## dbGetQuery # select (interactive work)## dbSendQuery # select (batched work)# dbFetch# dbClearResult###############################################################################.#-------------------------------------# sqldf is not enough#-------------------------------------# As you have already learned, SQL is a standard language for manipulating data# that is stored in a relational database. There are many different relational# database management system products and each one uses SQL to manipulate the# data in the database.# We started learning about the SQL select statement by using the sqldf# function from the sqldf package to manipulate R dataframes with # SQL select statements. However, sqldf does NOT allow you to manipulate# data that is stored in an actual database. The purpose for sqldf is to allow# R programmers who are familiar with SQL SELECT statements # to use their knowledge to manipulate R dataframes. This is very useful as # SQL is an extremely popular language that is widely known and understood.# However, in order to use SQL to manipulate data in an ACTUAL database, R programers# must use functions from packages other than sqldf. #-------------------------------------------------------------------# More about Relational Database Management System (RDBMS) software#-------------------------------------------------------------------# Relational database management system (RDBMS) products aim to provide a highly# efficient mechanism to store and retrieve data. Depending on the specific# RDBMS product, the type of computer (and many other details) a relational# database management system may able to read and write thousands or even# millions or billions of data values per second. To achieve this level of# efficiency often requires these products to integrate tightly with the# underlying computer in technical ways that go beyond what many other software# products require. Therefore installing and getting these products to work# correctly can sometimes be a little complex and require more than just running# a simple install program.## Relational Database Management Systems differ from programs such as Excel that# are designed for people to interact with them directly. Relational Database# Management System (RDBMS) software products are designed to allow other# computer programs to store and retrieve data from them. For example a program# written in R (or any other language) can store and retrieve data from an# RDBMS. These other computer programs (such as an R program) are in turn used# by business people to do their work. Typically, data technologists (e.g.# programmers data analysts, etc) write the programs (e.g. in R) that are used# by business people (e.g. traders, accountants). The data technologists use SQL# and R (or another programming language) to create easy to use tools for the# business people. In our class we will learn how to write R programs to send# and retrieve data from an RDBMS.#--------------------------------------------------------------------------# CLIENTS AND SERVERS## - client programs# - server programs## - client computers# - server computers#--------------------------------------------------------------------------# An RDBMS software program is known as a "server program" (it "serves up"# information). The computer programs that send and retrieve data from the RDBMS# program are known as "client programs". In a similar way, the programs that # run websites are known as "(web) server programs" (they also serve up information)# while web browsers are known as "client programs".# Most relational database management systems are intended to be used# "remotely". In other words, the relational database management system is# installed one computer and the computer programs that send and retrieve data# from the RDBMS are located on other computers. The computer that the RDBMS is# installed on is known as a "server computer" (the "server program" is# installed on the "server computer"). A computer that has software installed on# it that interacts with the RDBMS is known as a "client computer" (a "client# program" is installed on a "client computer"). The client program and server# program may communicate over the Internet or over a local computer network# (e.g. a network of computers in a company that is not accessible from the# larger Internet).## It is also possible for client programs and server programs to be# installed on the same computer. The details as to which computer each # program is installed on (whether the same computer or different computers)# is a detail that should not affect the way client programs are written.# Often when creating a new program (e.g. an R program) the developer has # both the server program (e.g. the RDBMS software) and the client program# (e.g. the program they are creating) installed on their personal computer.# When the program is eventually used by business users, the program could # then be configured to communicate with a different server computer. The# specifics of how exactly this is done is beyond the scope of what we are learning# in this class. ##--------------------------------------------------------------------# Different RDBMS software packages#--------------------------------------------------------------------# Some popular RDBMS packages that can be downloaded and installed on your# computer for free are:## - MySQL available for Windows/Mac/Linux https://www.mysql.com/## The installation of MySql can be somewhat complex. There are multiple# different products to choose between and may require "administrator privileges"# to install correctly. Nevertheless, it should be possible for you to install# and use MySql without too much hassle. However, the complications in my mind# do not make this an ideal solution for use in a classroom setting.## - PostgreSQL available for Windows/Mac/Linux https://www.postgresql.org/## - Microsoft SQL Server Express # available for Windows/Linux (not for Mac)# https://www.microsoft.com/en-us/sql-server/sql-server-downloads# NOTE: There are many different versions of "Microsoft SQL Server" software# the "Express" version is the "free" version.## For Windows users, this is a wonderful product that is much easier to use# than MySql. However, Microsoft SQL Server is NOT available "out of the box" for Mac. # While, you can find information online about how to install SQL Server on Macs,# this requires using Docker software - which itself has a learning curve.# If you own a mac and want to try it anyway# you can see this page - https://database.guide/how-to-install-sql-server-on-a-mac/# or search online for many other websites that have similar information.### - SQLite (https://www.sqlite.org/)## This is a "simple" relational database management system. It is designed# to be used only "locally", not over a computer network. In other words,# the client program (e.g. an R program) that interacts with the RDBMS is# expected to be installed on the same computer as the SQLite software. This# software lacks many of the more sophisticated features found in other# products. SQLite is not designed to be used with tremendous amounts of# information. However, SQLite does allow programs that communicate via SQL# to store their information in a database without too much hassle. This# product is often used by computer programs and apps that don't require# tremendous amounts of data to store and retrieve data in a relational# database.### DIFFERENT OPTIONS TO INSTALL SQLite## Installation of SQLite is relatively straight forward. The entire SQLite# program is contained in a single "executable file" that can be downloaded# from the SQLite website. This software does NOT have a "graphical user# interface (GUI)". Rather the program presents a simple "command line"# interface - similar to R - in which you can type commands and see the# results.## In addition to the "plain vanilla" SQLite that is distributed by# sqlite.org, there are other add-on programs that you can download to# provide a more "friendly" GUI user interface. One such program is "SQLite# Studio" (https://sqlitestudio.pl/). The latest version of SQLite Studio# also does not come with an installation program. Just a zip file that# contains all the files that the program needs to run. To use this program,# unzip the downloaded file and run the program that is in the file named# SQLiteStudio.exe.## To use SQLite from within R does not require downloading any files from# the SQLite website or any other website. Rather the RSQLite package in R# has all of the functionality needed to use SQLite built into it. In our# class we will be using the RSQLite package to interface with the SQLite# database.#--------------------------------------------------------------------.#--------------------------------------------------------------------.# sqlfiddle.com#--------------------------------------------------------------------.# sqlfiddle.com - you can use this website to try different flavors of sql.# You can choose the flavor of SQL to use from a dropdown list, (e.g.# MySQL, PostgreSQL, SQLite, Oracle)## On the left side of the sqlfiddle.com screen you create tables and # and insert and modify data by using CREATE TABLE, INSERT, UPDATE and DELETE# SQL statements.## On the right side of the sqlfiddle.com screen you can enter # SELECT statements that manipulate the data.## EXAMPLE - click on the following URL:# http://sqlfiddle.com/#!17/8efb57/8# To see the following database tables and sample data running in the# sqlfiddle.com website.# # +---------------------+ +----------------------+# | venues | | parties |# +---------------------+ +----------------------+# | PK venueId | | PK partyId |# | | | |# | venueName | | FK venueId | # | | | /| |# | city +-+---------+ numGuests |# | | | \| |# | state | | cuisine |# | | | |# | kosherKitchen | | customerFirstName|# | | | |# | | | customerLastName |# | | | |# | | | partyDate |# | | | |# +---------------------+ | durationInHours | # +----------------------+ # \|/# |# |# |# ---# |# +---------------------+ +----------------------+# | waiters | | parties_waiters |# +---------------------+ +----------------------+# | PK waiterId | | PK,FK1 partyId |# | | | |# | firstName | | PK,FK2 waiterId | # | | | /| |# | lastName +-+---------+ hourlyWage |# | | | \| |# | gender | | |# | | | |# +---------------------+ +----------------------+##--------------------------------------------------------------------.#----------------------------------------------------------------------------.# DDL vs DML statements## SQL statements that create or modify the structure of the database # tables are known collectively as Data Definition Language (DDL) SQL statements. # These include commands such as ## CREATE TABLE (that creates the table structures),# DROP TABLE (that destroys an entire table, including the structure), # ALTER TABLE (that alters or changes the structure of tables, e.g. add columns)# other similar commands ## SQL statements such as SELECT, INSERT, UPDATE and DELETE that create, modify# retrieve and delte "data" in the tables are collectively known as # Data Manipulation Language (DML) SQL statements.### NOTE: (not very important, but may clarify something ...)## sqlfiddle.com instructs you to put DDL statements in the left hand side# of the window and SELECT statements in the right hand side. The website# (incorrectly, IMHO) refers to INSERT, UPDATE and DELETE statemens as DDL # statements. I believe most people in the industry refer to these as DML# statements.#----------------------------------------------------------------------------.#-----------------------------------------------------------------------.# More about SQLite databases.## SQLite is designed to be used without the need for a server program.# All the data in a SQLite database is packed into a single file.# To connect to a SQLite database, you need to know the path to the # SQLite file for the database.## The extension for a SQLite file is not standardized - sometimes# you will see .db sometimes .sqlite sometimes other extensions#-----------------------------------------------------------------------.#########################################################################.# Using DBI to connect to a database from R#------------------------------------------------------------------------.# To connect to a database from R, you can use functions from the # DBI package. You will also need to use an additional package that is# specific for the type of database software you are going to connect to. ## For example:# - to connect to a sqlite database you will need to use# functions from both the DBI package, as well as the RSQLite package. ## - to connect to a PostgreSQL database you will need to use# functions from both the DBI package, as well as the RPostgreSQL package. ## Fortunately, the code to write for both types of database software products# as well as many other database software products is VERY similar.## Below, we will demonstrate how to connect to both a sqlite database and a# PostgreSQL database using the DBI package. ## NOTE: There is an alternative to the DBI approach described above.# Another popular alternative to connect to a database from R is to # use the RODBC package. You can research that on your own if # you like. The approach we take here with the DBI package# is just as powerful as ODBC - just an alternative. They are both# popular approaches. #########################################################################.#------------------------------------------------------------------------.# How to connect to a sqlite database using the DBI package.#------------------------------------------------------------------------.# Use R's DBI package to establish a "connection" with a database.# You also need to use a package for the particular database software # you are going to be connecting with. For example, the RSQLite package# is used for the sqlite database.## The DBI package contains a function called dbConnect. # The DBMS specific packages, such as RSQLite also contain a dbConnect function.# The DBMS specific version of the dbConnect function, calls the DBI version# of the dbConnect function as part of its processing.## The RSQLite package also provides the SQLite() function. This function# returns a "driver" object that knows the specifics of how to work with# sqlite databases. ## The dbConnect function in the DBI package takes a "driver" as a first# argument. The other arguments to dbConnect depend on the type of# database software you are trying to connect to. ## See the code below for more info.#########################################################################.#---------------------------------------------------------------------.# Establish a connection to the SQLite database:#---------------------------------------------------------------------.#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~# You will need both the DBI and RSQLite packages#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~if(!require(DBI)) {install.packages("DBI"); require(DBI)}
help(package="DBI")help(package="RSQLite")#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~# Create a connection to the Books database#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~# this should be a path to the .sqlite file on your computersqliteBooksFile ="data/books.sqlite"# Create the connection variable, conBooks (you can name this anything you want)# You will use conBooks whenever you want to access the database.# See below for more info.conBooks <-dbConnect(RSQLite::SQLite(), dbname = sqliteBooksFile)#---------------------------------------------------------------------.# Show the tables in the database#---------------------------------------------------------------------.dbListTables(conBooks)
#---------------------------------------------------------------------.# The following functions from the DBI package can be used to # run SQL statements in the database:## FOR SELECT STATEMENTS USE# dbGetQuery() or # use to get entire results all at once# dbSendQuery() # use for retrieving large amounts of info a few rows at a time# # FOR OTHER SQL STATEMENTS USE # dbExecute() or dbSendStatement() #---------------------------------------------------------------------.#......................................# Get everything all at once#......................................dbGetQuery(conBooks, "select * from authors")
au_id au_fname au_lname phone address city state zip
1 A01 Sarah Buchman 718-496-7223 75 West 205 St Bronx NY 10468
2 A02 Wendy Heydemark 303-986-7020 2922 Baseline Rd Boulder CO 80303
3 A03 Hallie Hull 415-549-4278 3800 Waldo Ave, #14F San Francisco CA 94123
4 A04 Klee Hull 415-549-4278 3800 Waldo Ave, #14F San Francisco CA 94123
5 A05 Christian Kells 212-771-4680 114 Horatio St New York NY 10014
6 A06 Harvey Kellsey 650-836-7128 390 Serra Mall Palo Alto CA 94305
7 A07 Paddy O'Furniture 941-925-0752 1442 Main St Sarasota FL 34236
dbGetQuery(conBooks, "select * from titles where type='biography'")
title_id title_name type pub_id pages price sales pubdate
1 T06 How About Never? biography P01 473 19.95 11320 2000-07-31 00:00:00
2 T07 I Blame My Mother biography P03 333 23.95 1500200 1999-10-01 00:00:00
3 T10 Not Without My Faberge Egg biography P05
4 T12 Spontaneous, Not Annoying biography P05 507 12.99 100001 2000-08-31 00:00:00
#.................................................# To get large amounts of data a few rows at a time use the following functions:# # # Call dbSendQuery to get a "handle"## RESULT_HANDLE = dbSendQuery( conn = CONNECTION ,# statement = SOME_SELECT_STATEMENT )## # Call dbFetch multiple times.# # Each time you get back a dataframe with a few rows.# # You could also call dbFetch in a loop## ROWS = dbFetch( res = RESULT_HANDLE , n = NUMBER_OF_ROWS)# MORE_ROWS = dbFetch( res = RESULT_HANDLE , n = NUMBER_OF_ROWS)# MORE_ROWS = dbFetch( res = RESULT_HANDLE , n = NUMBER_OF_ROWS)# MORE_ROWS = dbFetch( res = RESULT_HANDLE , n = NUMBER_OF_ROWS)# etc. ## # It is a very good idea to call dbClearResult when you are done processing# # all the rows. That will free up any computer memory that was allocated# # to processing the results. Also most DBMS products have a limited number# # of queries that can be processed simultaneously. dbClearResult# # will free up the resources used by this query so that there are # # more resources available in the database for others who might be trying# # to access the database simultaneously.## dbClearResult( RESULT_HANDLE )#.................................................# Get the rows from the titles table 5 at a time.# You can see a summary of where you're up to in the entire result set # by displaying the results variable.results <-dbSendQuery(conBooks, "select * from titles")dbFetch(results, n=5) # get first 5 rows
title_id title_name type pub_id pages price sales pubdate
1 T01 1977! history P01 107 21.99 566 2000-08-01 00:00:00
2 T02 200 Years of German Humor history P03 14 19.95 9566 1998-04-01 00:00:00
3 T03 Ask Your System Administrator computer P02 1226 39.95 25667 2000-09-01 00:00:00
4 T04 But I Did It Unconsciously psychology P01 510 12.99 13001 1999-05-31 00:00:00
5 T05 Exchange of Platitudes psychology P01 201 6.95 201440 2001-01-01 00:00:00
title_id title_name type pub_id pages price sales pubdate
1 T06 How About Never? biography P01 473 19.95 11320 2000-07-31 00:00:00
2 T07 I Blame My Mother biography P03 333 23.95 1500200 1999-10-01 00:00:00
3 T08 Just Wait Until After School children P01 86 10.00 4095 2001-06-01 00:00:00
4 T09 Kiss My Boo-Boo children P01 22 13.95 5000 2002-05-31 00:00:00
5 T10 Not Without My Faberge Egg biography P05
dbClearResult(results) # we're done so free up any "resources" (e.g. memory or connections to database)# Let's do that again ...# You can also use the following functions to return specific info about the # results, by passing the function the RESULTS_HANDLE that you got back from # the dbSendQuery function.## dbGetRowCount(RESULTS_HANDLE) # return the # of rows returned so far# dbHasCompleted(RESULTS_HANDLE) # FALSE if more rows to be retrieved, TRUE otherwise## There are other functions that could be used with the RESULTS_HANDLE too.# See the documentation for more info.results <-dbSendQuery(conBooks, "select * from titles")dbFetch(results, n=5) # get first 5 rows
title_id title_name type pub_id pages price sales pubdate
1 T01 1977! history P01 107 21.99 566 2000-08-01 00:00:00
2 T02 200 Years of German Humor history P03 14 19.95 9566 1998-04-01 00:00:00
3 T03 Ask Your System Administrator computer P02 1226 39.95 25667 2000-09-01 00:00:00
4 T04 But I Did It Unconsciously psychology P01 510 12.99 13001 1999-05-31 00:00:00
5 T05 Exchange of Platitudes psychology P01 201 6.95 201440 2001-01-01 00:00:00
dbGetRowCount(results) # 5
[1] 5
dbHasCompleted(results) # FALSE
[1] FALSE
dbFetch(results, n=5) # get the next 5 rows
title_id title_name type pub_id pages price sales pubdate
1 T06 How About Never? biography P01 473 19.95 11320 2000-07-31 00:00:00
2 T07 I Blame My Mother biography P03 333 23.95 1500200 1999-10-01 00:00:00
3 T08 Just Wait Until After School children P01 86 10.00 4095 2001-06-01 00:00:00
4 T09 Kiss My Boo-Boo children P01 22 13.95 5000 2002-05-31 00:00:00
5 T10 Not Without My Faberge Egg biography P05
dbGetRowCount(results) # 10
[1] 10
dbHasCompleted(results) # FALSE
[1] FALSE
dbFetch(results, n=5) # try to get the next 5 rows (but there are only 3 more)
title_id title_name type pub_id pages price sales pubdate
1 T11 Perhaps It's a Glandular Problem psychology P05 826 7.99 94123 2000-11-30 00:00:00
2 T12 Spontaneous, Not Annoying biography P05 507 12.99 100001 2000-08-31 00:00:00
3 T13 What Are The Civilian Applications? history P03 802 29.99 10467 1999-05-31 00:00:00
dbGetRowCount(results) # 13
[1] 13
dbHasCompleted(results) # TRUE
[1] TRUE
dbClearResult(results) # we're done so free up any "resources" (e.g. memory or connections to database)# You can do that in a loop to get back 5 rows at a time, # do some processing and then continue.# Start the query and get the first few rows.results <-dbSendQuery(conBooks, "select * from titles")rows =dbFetch(results, n=5) # get first 5 rows# Keep looping until we get all of the rows.while ( dbHasCompleted(results) ==FALSE ){ newRows =dbFetch(results, n=5) # get first 5 rows rows =rbind(rows, newRows)}# We're done getting all the rows so clear the results handle.# This releases any memory or other database "resources" that were being used# to process the results.dbClearResult(results) # These are all the rowsrows
title_id title_name type pub_id pages price sales pubdate
1 T01 1977! history P01 107 21.99 566 2000-08-01 00:00:00
2 T02 200 Years of German Humor history P03 14 19.95 9566 1998-04-01 00:00:00
3 T03 Ask Your System Administrator computer P02 1226 39.95 25667 2000-09-01 00:00:00
4 T04 But I Did It Unconsciously psychology P01 510 12.99 13001 1999-05-31 00:00:00
5 T05 Exchange of Platitudes psychology P01 201 6.95 201440 2001-01-01 00:00:00
6 T06 How About Never? biography P01 473 19.95 11320 2000-07-31 00:00:00
7 T07 I Blame My Mother biography P03 333 23.95 1500200 1999-10-01 00:00:00
8 T08 Just Wait Until After School children P01 86 10.00 4095 2001-06-01 00:00:00
9 T09 Kiss My Boo-Boo children P01 22 13.95 5000 2002-05-31 00:00:00
10 T10 Not Without My Faberge Egg biography P05
11 T11 Perhaps It's a Glandular Problem psychology P05 826 7.99 94123 2000-11-30 00:00:00
12 T12 Spontaneous, Not Annoying biography P05 507 12.99 100001 2000-08-31 00:00:00
13 T13 What Are The Civilian Applications? history P03 802 29.99 10467 1999-05-31 00:00:00
# The default for dbFetch is to retrieve ALL of the records.# In the following examples all of the records are retrieved at once.results <-dbSendQuery(conBooks, "select * from titles where type='biography'")results # This is NOT the actual data - it is a "handle" that you need to use with other functions
<SQLiteResult>
SQL select * from titles
where type='biography'
ROWS Fetched: 0 [incomplete]
Changed: 0
dbFetch(results) # get all of the data
title_id title_name type pub_id pages price sales pubdate
1 T06 How About Never? biography P01 473 19.95 11320 2000-07-31 00:00:00
2 T07 I Blame My Mother biography P03 333 23.95 1500200 1999-10-01 00:00:00
3 T10 Not Without My Faberge Egg biography P05
4 T12 Spontaneous, Not Annoying biography P05 507 12.99 100001 2000-08-31 00:00:00
dbClearResult(results) # clear the computer memory/resources after you're finished# You can also retrieve the rest of the rows that haven't been retrieved yet.results <-dbSendQuery(conBooks, "select * from authors")dbFetch(results, n=2) # get the first 2 rows
au_id au_fname au_lname phone address city state zip
1 A01 Sarah Buchman 718-496-7223 75 West 205 St Bronx NY 10468
2 A02 Wendy Heydemark 303-986-7020 2922 Baseline Rd Boulder CO 80303
dbFetch(results, n=-1) # get the rest of the rows
au_id au_fname au_lname phone address city state zip
1 A03 Hallie Hull 415-549-4278 3800 Waldo Ave, #14F San Francisco CA 94123
2 A04 Klee Hull 415-549-4278 3800 Waldo Ave, #14F San Francisco CA 94123
3 A05 Christian Kells 212-771-4680 114 Horatio St New York NY 10014
4 A06 Harvey Kellsey 650-836-7128 390 Serra Mall Palo Alto CA 94305
5 A07 Paddy O'Furniture 941-925-0752 1442 Main St Sarasota FL 34236
dbClearResult(results) # clear the computer memory/resources after you're finisheddbListTables(conBooks) # packages DBI, odbc
#---------------------------------------------------------------------.# Connecting to a PostgreSQL database## See the info here: https://hevodata.com/learn/rpostgresql/#---------------------------------------------------------------------.# Many RDBMS (Relational Database Management System) software products# are designed for use with massive amounts of data. They are# often more complicated software products that have many options for optimizing# how they work. These types of RDBMS products are often run on a separate# computer. PostgreSQL is such a product. Connecting to it requires# knowledge of the following info:## - The name of the database. A database is a collection of tables# The same PostgreSQL software can manage several different databases.## - The "name" of the computer that is running the PostgreSQL software.# This could be in the form of## * a "domain name", e.g. somedomain.com,# * an IP address, e.g. 192.168.15.1# * "domain name", e.g. somedomain.com,# # Every website on the internet is running on a computer. The # domain name ##dsn_database ="catering"# Specify the name of your Database# Specify host name e.g.:"aws-us-east-1-portal.4.dblayer.com"# 127.0.0.1 is an IP address that refers to your computerdsn_hostname ="127.0.0.1"# see: https://dba.stackexchange.com/questions/41458/changing-postgresql-port-using-command-line# or: https://stackoverflow.com/questions/15100368/postgresql-port-confusion-5433-or-5432## find the file "postgresql.conf" # on my Windows computer it is in the following folder:# \Program Files\PostgreSQL\15\data# There should be a line in that file that says: port = SOME_NUMBERdsn_port ="5432"# Specify your port number. e.g. 98939#dsn_port = "5050" # Specify your port number. e.g. 98939#dsn_port = "65335" # Specify your port number. e.g. 98939# this is the default user iddsn_uid ="postgres"# Specify your username. e.g. "admin"#dsn_uid = "pgadmin4" # Specify your username. e.g. "admin"#dsn_uid = "pgadmin4" # Specify your username. e.g. "admin"# this is whatever you set your password todsn_pwd ="password"# Specify your password. e.g. "xxx"## On March 30, 2023,# I got the following error when trying to connect R # to postgres 15.2 on Windows 10 Home:## SCRAM authentication requires libpq version 10 or above## This webpage:# https://stackoverflow.com/questions/62807717/how-can-i-solve-postgresql-scram-authentication-problem# says the following:## > Your application uses an API that is linked with the PostgreSQL client C library libpq.# > The version of that library must be 9.6 or older, and SCRAM authentication was introduced in v10.# > Upgrade libpq on the application end and try again.# > # > If you don't need scram-sha-256 authentication, you can revert to md5:# > # > * set password_encryption = md5 in postgresql.conf# > * change the authentication method to md5 in pg_hba.conf# > * reload PostgreSQL# > * change the password of the user to get an MD5 encrypted password## I followed the 2nd suggestion - i.e. # > * set password_encryption = md5 in postgresql.conf# > * change the authentication method to md5 in pg_hba.conf# > * reload PostgreSQL# > * change the password of the user to get an MD5 encrypted password## -YR #---------------------------------------------------------------------.# You will need both the DBI and RPostgreSQL packagesif(!require(DBI)) {install.packages("DBI"); require(DBI)}if(!require(RPostgreSQL)) {install.packages("RPostgreSQL"); require(RPostgreSQL)}
Loading required package: RPostgreSQL
help(package="DBI")help(package="RPostgreSQL")dsn_database ="catering"# Specify the name of your Database# Specify host name e.g.:"aws-us-east-1-portal.4.dblayer.com"# 127.0.0.1 is an IP address that refers to your computerdsn_hostname ="127.0.0.1"# see: https://dba.stackexchange.com/questions/41458/changing-postgresql-port-using-command-line# or: https://stackoverflow.com/questions/15100368/postgresql-port-confusion-5433-or-5432## find the file "postgresql.conf" # on my Windows computer it is in the following folder:# \Program Files\PostgreSQL\15\data# There should be a line in that file that says: port = SOME_NUMBERdsn_port ="5432"# Specify your port number. e.g. 98939#dsn_port = "5050" # Specify your port number. e.g. 98939#dsn_port = "65335" # Specify your port number. e.g. 98939# this is the default user iddsn_uid ="postgres"# Specify your username. e.g. "admin"#dsn_uid = "pgadmin4" # Specify your username. e.g. "admin"#dsn_uid = "pgadmin4" # Specify your username. e.g. "admin"# this is whatever you set your password todsn_pwd ="password"# Specify your password. e.g. "xxx"# Call dbConnect to connect to the databasetryCatch({ drv <-dbDriver("PostgreSQL") print("Connecting to Database…") conCatering <-dbConnect(drv, dbname = dsn_database,host = dsn_hostname, port = dsn_port,user = dsn_uid, password = dsn_pwd)print("Database Connected!")},error=function(cond) {print("Unable to connect to Database.")print(cond)})
[1] "Connecting to Database…"
[1] "Database Connected!"
# You can now use any of the DBI functions by passing the connection# object for the catering database.# Use dbGetQuery to get instant results from select statements.dbGetQuery(conCatering, "select * from waiters")
waiterid firstname lastname gender
1 w01 steve cohen m
2 w02 larry jones m
3 w03 clara jones f
4 w04 julie cohen f
5 w05 frank smith m
6 w06 bob cohen m
# Use dbSendQuery, dbFetch, dbClearResult to get only some rows at a time.results <-dbSendQuery(conCatering, "select * from parties") dbFetch(results, n=2) # get first two rows
partyid venueid numguests cuisine customerfirstname customerlastname partydate durationinhours
1 p01 v01 300 italian al gindy 2019-01-01 3
2 p02 v01 200 italian al gindy 2020-01-01 2
dbFetch(results) # get the rest of the rows
partyid venueid numguests cuisine customerfirstname customerlastname partydate durationinhours
3 p03 v01 175 syrian moshe gindy 2020-02-01 2
4 p04 v01 100 italian ralph jones 2020-03-01 3
5 p05 v02 175 italian gavriel cohen 2020-04-01 1
6 p06 v02 125 chinese yoel schwartz 2020-05-01 1
7 p07 v03 75 <NA> mike smith 2020-06-01 3
8 p08 v05 50 chinese larry cohen 2019-07-01 4
9 p09 v05 50 italian mike schatz 2020-07-01 4
10 p10 v05 50 chinese ping sez 2020-08-01 4
11 p99 v05 100 <NA> yitz rosenthal <NA> NA
dbClearResult(results) # clear the computer memory/resources after you're finished
[1] TRUE
# Use dbListTables to see the tables in the database# Use dbExecute to run other SQL commands - eg. create or drop a tabledbListTables(conCatering)