Raw Data Import

Introduction

This document outlines the basic workflow for importing data into R with the RDBEScore package. The data can come either from files downloaded from the ICES Regional Database & Estimation System (RDBES) or from a list object containing data frames (or data.tables).

The function createRDBESDataObject() directly imports the Commercial Landing (CL), Commercial Effort (CE) and Commercial Sampling (CS) tables downloaded from RDBES.

Load the package

library(RDBEScore)

Importing zipped files

createRDBESDataObject() can directly import the .zip archive from an RDBES download that contains all mandatory hierarchy tables plus VD and SL:

importedH1 <- createRDBESDataObject(input = "./vignetteData/H1_2023_10_16.zip")
# print the names of the tables that are not NULL
names(importedH1[!unlist(lapply(importedH1, is.null))])
#>  [1] "DE" "SD" "VS" "FT" "FO" "SS" "SA" "FM" "BV" "VD" "SL"
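The filtering idiom above recurs throughout this vignette. As a minimal base R sketch, it could be wrapped in a small helper. The name nonNullTableNames is hypothetical (it is not part of RDBEScore), and a toy list stands in for a real imported object:

```r
# Hypothetical helper (not part of RDBEScore):
# return the names of the list elements that are not NULL
nonNullTableNames <- function(x) {
  names(x)[!vapply(x, is.null, logical(1))]
}

# Toy list standing in for an imported object; CL and CE are left empty
toyObject <- list(DE = data.frame(DEid = 1:2),
                  CL = NULL,
                  CE = NULL,
                  SL = data.frame(SLid = 1:3))

nonNullTableNames(toyObject)
```

With a real object the call would be nonNullTableNames(importedH1), assuming the object behaves like a plain named list, as its class vector suggests.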

The easiest way to get a glimpse of the imported data hierarchy and the row count of each table is to print the object. For sampled tables, the output also lists the selection methods and the ranges of number sampled and number total.

# calls the print method
importedH1
#> Hierarchy 1 RDBESdataObject:
#>  DE: 8
#>  SD: 8
#>  VS: 1214 (SRSWOR,CENSUS,SRSWR: 2-135/4-1382)
#>  FT: 1430 (CENSUS,SRSWR: 1-3/1-100)
#>  FO: 1916 (CENSUS,SRSWR: 1-3/1-20)
#>  SS: 1916 (CENSUS,SRSWR: 1/1-4)
#>  SA: 1916 (CENSUS,SRSWR: 1/1-2)
#>  FM: 7290
#>  BV: 14580 (SRSWR: 2/2)
#>  VD: 311
#>  SL: 170
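To make the layout of one summary line easier to read, the sketch below rebuilds a similar string from a toy table. It reflects one reading of the output format (rows, then selection methods, then the range of number sampled, then the range of number total). The column names selectMeth, numSamp and numTotal and the helper summariseTable are assumptions for illustration, not the package's internals:

```r
# Toy stand-in for a sampling table; the column names are assumptions
toyVS <- data.frame(
  selectMeth = c("SRSWR", "CENSUS", "SRSWR"),
  numSamp    = c(2, 10, 135),
  numTotal   = c(4, 10, 1382)
)

# Hypothetical helper producing
# "<rows> (<methods>: <range sampled>/<range total>)"
summariseTable <- function(df) {
  # collapse a numeric vector to "min-max", or to a single value if constant
  collapseRange <- function(x) {
    r <- range(x, na.rm = TRUE)
    if (r[1] == r[2]) as.character(r[1]) else paste(r, collapse = "-")
  }
  sprintf("%d (%s: %s/%s)",
          nrow(df),
          paste(unique(df$selectMeth), collapse = ","),
          collapseRange(df$numSamp),
          collapseRange(df$numTotal))
}

summariseTable(toyVS)
```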

createRDBESDataObject() can also import .zip archives that contain only a CL, CE, VD or SL table. All other tables are then included as NULL:

importedSL <- createRDBESDataObject(input = "./vignetteData/HSL_2023_10_16.zip")
# print the object summary
importedSL
#> No hierarchy, RDBESdataObject:
#>  SL: 170

When several .zip archives are supplied, a file from an archive later in the list overwrites an identically named file from an earlier archive. Each overwrite results in a warning.

importFiles <- c("./vignetteData/HSL_2023_10_16.zip",
                 "./vignetteData/H1_2023_10_16.zip")
importedTables <- createRDBESDataObject(input = importFiles)
#> Warning in FUN(X[[i]], ...): Duplicate unzipped files detected:
#>  SpeciesList.csv
# print the names of the tables that are not NULL
names(importedTables[!unlist(lapply(importedTables, is.null))])
#>  [1] "DE" "SD" "VS" "FT" "FO" "SS" "SA" "FM" "BV" "VD" "SL"
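The "later entry wins, with a warning" behaviour can be illustrated in base R on toy lists. This is only a sketch of the idea, not how RDBEScore implements it. The helper combineTables and the toy data are hypothetical:

```r
# Hypothetical helper: combine several named lists of tables.
# Later lists overwrite earlier ones and every overwrite raises a warning.
combineTables <- function(...) {
  combined <- list()
  for (tables in list(...)) {
    clashes <- intersect(names(combined), names(tables))
    if (length(clashes) > 0) {
      warning("Duplicate tables detected: ",
              paste(clashes, collapse = ", "), call. = FALSE)
    }
    combined[names(tables)] <- tables
  }
  combined
}

# Toy inputs mimicking an SL-only archive followed by a full H1 archive
fromSL <- list(SL = data.frame(source = "HSL archive"))
fromH1 <- list(DE = data.frame(source = "H1 archive"),
               SL = data.frame(source = "H1 archive"))

# suppressWarnings() keeps the sketch quiet; remove it to see the warning
combined <- suppressWarnings(combineTables(fromSL, fromH1))
combined$SL$source
```

Because the order of the inputs decides which copy survives, the archive holding the preferred version of a table should come last.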

Importing csv files

createRDBESDataObject() can also import unzipped .csv files that use the default RDBES file names. In the example below, listOfFileNames maps the VS table to its file:

importedVS <- createRDBESDataObject(input = "./vignetteData/",
                                    listOfFileNames = list("VS" = "VesselSelection.csv"))
# print the names of the tables that are not NULL
names(importedVS[!unlist(lapply(importedVS, is.null))])
#> [1] "VS"
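Before importing from a folder, it can help to confirm that the requested files are actually present. The base R sketch below does this for a named list in the same shape as listOfFileNames. The temporary folder and the toy file are created only for illustration:

```r
# Hypothetical set-up: a temporary folder holding one toy csv file
csvDir <- file.path(tempdir(), "toyRDBES")
dir.create(csvDir, showWarnings = FALSE)
write.csv(data.frame(VSid = 1:2),
          file.path(csvDir, "VesselSelection.csv"),
          row.names = FALSE)

# Named list in the same shape as the listOfFileNames argument
fileNames <- list("VS" = "VesselSelection.csv",
                  "SL" = "SpeciesList.csv")

# Which of the requested files exist in the folder?
present <- file.exists(file.path(csvDir, unlist(fileNames)))
names(present) <- names(fileNames)
present
```

Here only the VS file exists, so present is TRUE for VS and FALSE for SL.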

Importing list of data frames

createRDBESDataObject() can also import a list object containing data frames (or data.tables). Note that this type of import bypasses the RDBES upload data integrity checks.

#list of data frames 
listOfDfsH1 <- readRDS("./vignetteData/H1_2023_10_19.rds")
#print the class of the list elements 
sapply(listOfDfsH1, class)
#>           DE           SD           VS           FT           FO           SS 
#> "data.frame" "data.frame" "data.frame" "data.frame" "data.frame" "data.frame" 
#>           SA           FM           BV           VD           SL 
#> "data.frame" "data.frame" "data.frame" "data.frame" "data.frame"

importedList <- createRDBESDataObject(listOfDfsH1)
#> Warning in createRDBESDataObject(listOfDfsH1): NOTE: Creating RDBES data objects from a list of local data frames bypasses the RDBES upload data integrity checks.
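Since the upload integrity checks are bypassed for list input, a quick structural check beforehand may be worthwhile. The sketch below uses a toy list and only base R to find elements that are not data frames. A data.table also passes is.data.frame(), because it inherits from data.frame:

```r
# Toy list; the VS element is deliberately not a table
toyList <- list(DE = data.frame(DEid = 1L),
                SD = data.frame(SDid = 1L),
                VS = "not a table")

# TRUE for every element that is a data frame (or a data.table)
isTable <- vapply(toyList, is.data.frame, logical(1))

# Names of the elements that would need fixing before import
names(toyList)[!isTable]
```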

Object class RDBESDataObject

The created objects are of the S3 class “RDBESDataObject”. The class has defined print(), summary() and sort() methods. For more information on these, see the vignette Manipulating RDBESDataObjects.

importedTables <- createRDBESDataObject("./vignetteData/H1_2023_10_16.zip")
class(importedTables)
#> [1] "RDBESDataObject" "list"
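For readers less familiar with S3, the toy example below shows how a class vector such as the one above drives method dispatch. The class ToyDataObject and its print method are hypothetical and unrelated to the RDBEScore implementation:

```r
# Toy S3 object: a named list with an extra class in front of "list"
toy <- structure(list(DE = data.frame(DEid = 1:8), CL = NULL),
                 class = c("ToyDataObject", "list"))

# print() dispatches on the first class, so this method is used for toy
print.ToyDataObject <- function(x, ...) {
  for (tableName in names(x)) {
    if (!is.null(x[[tableName]])) {
      cat(" ", tableName, ": ", nrow(x[[tableName]]), "\n", sep = "")
    }
  }
  invisible(x)
}

print(toy)

# The object is still a list, so list tools such as names() keep working
inherits(toy, "list")
```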

Validating an RDBESDataObject

The structure of an RDBESDataObject can be validated using the validateRDBESDataObject() function.

validateRDBESDataObject(importedTables, verbose = TRUE)
#> [1] "Note that TE is NULL but this is allowed in an RDBESDataObject"
#> [2] "Note that LO is NULL but this is allowed in an RDBESDataObject"
#> [3] "Note that OS is NULL but this is allowed in an RDBESDataObject"
#> [4] "Note that LE is NULL but this is allowed in an RDBESDataObject"
#> [5] "Note that CL is NULL but this is allowed in an RDBESDataObject"
#> [6] "Note that CE is NULL but this is allowed in an RDBESDataObject"

To see what you can do with the imported RDBESDataObject, see the other vignettes, such as Manipulating RDBESDataObjects.

Other vignettes: