What is nga_cbn()?

nga_cbn() is the primary data retrieval function in opennaijR. It connects your R session directly to the Central Bank of Nigeria (CBN) data infrastructure, handles all cleaning and formatting automatically, and delivers an analysis-ready object – all in one call.

The function name follows the opennaijR convention for all African data sources:

country_code  +  institution_code  +  ()
     nga       +       cbn         +  ()  =  nga_cbn()

When opennaijR expands to other countries, the same pattern applies consistently:

gha_bog()   -- Ghana, Bank of Ghana
ken_cbk()   -- Kenya, Central Bank of Kenya
zaf_sarb()  -- South Africa, South African Reserve Bank
eth_nbe()   -- Ethiopia, National Bank of Ethiopia

A user who learns nga_cbn() already understands every future function.
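
The naming rule is mechanical enough to express in code. The helper below is a hypothetical illustration, not a function exported by opennaijR; it only shows how a connector name is composed from the two codes. The country codes used above happen to match ISO 3166-1 alpha-3.

```r
# Hypothetical helper (not part of opennaijR): compose a connector name
# from a three-letter country code and an institution code.
connector_name <- function(country_code, institution_code) {
  paste0(tolower(country_code), "_", tolower(institution_code))
}

connector_name("NGA", "CBN")
#> [1] "nga_cbn"

connector_name(c("gha", "ken"), c("bog", "cbk"))
#> [1] "gha_bog" "ken_cbk"
```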


Arguments at a Glance

Argument     Type              Default     What it does
dataset      character         required    Dataset key or alias from discover_datasets()
variables    character vector  NULL        Specific indicator codes to retrieve. NULL returns all
from         character / Date  NULL        Start date in "YYYY-MM-DD" format
to           character / Date  today       End date in "YYYY-MM-DD" format
filters      list              list()      Advanced SDMX filter criteria
raw          logical           FALSE       If TRUE, returns the uncleaned API response
canonical    logical           TRUE        If TRUE, applies opennaijR standard cleaning and naming
refresh      logical           FALSE       If TRUE, bypasses the local cache and downloads fresh data
auto.assign  logical           TRUE        If TRUE, assigns result to your environment automatically
env          environment       .GlobalEnv  Where to assign the object when auto.assign = TRUE
quiet        logical           TRUE        If TRUE, suppresses console messages during fetch

Part 1 – The Two Retrieval Modes

nga_cbn() works in two distinct modes. Understanding the difference is the most important thing to learn about this function.


Mode 1 – Auto-assign (the quantmod way)

library(opennaijR)

nga_cbn("inflation")

head(inflation)

Notice there is no assignment operator. You did not write inflation <- nga_cbn("inflation").

Instead, nga_cbn() silently created a variable called inflation in your global environment and placed the data there. The function returned the string "inflation" invisibly – just the name, not the data.

This mirrors how quantmod::getSymbols() works, and it is the default behaviour (auto.assign = TRUE).

What you get in your environment:

ls()
#> [1] "inflation"

class(inflation)
#> [1] "opennaijR_tbl" "data.frame"

head(inflation[, c("date", "headline_yoy", "food_yoy")])
#>         date headline_yoy food_yoy
#> 1 2010-01-01        11.80    14.60
#> 2 2010-02-01        14.05    17.02

Mode 2 – Explicit assignment (the standard R way)

df <- nga_cbn("exchange_rates", auto.assign = FALSE)

head(df)

Here auto.assign = FALSE tells the function to return the data frame directly so you can assign it to any name you choose.
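
The difference between the two modes can be pictured with a small stand-in function. This is only a sketch: fetch_demo() is hypothetical, returns made-up values, and is not how nga_cbn() is implemented internally. It simply reproduces the behaviour described above with base R's assign() and invisible().

```r
# Hypothetical stand-in: imitates the two retrieval modes without
# contacting any API. The data values are invented for illustration.
fetch_demo <- function(dataset, auto.assign = TRUE, env = globalenv()) {
  data <- data.frame(
    date  = as.Date(c("2020-01-01", "2020-02-01")),
    value = c(1.5, 2.5)
  )
  if (auto.assign) {
    assign(dataset, data, envir = env)  # create a variable named after the key
    return(invisible(dataset))          # return only the name, invisibly
  }
  data                                  # Mode 2: return the data itself
}

demo_env <- new.env()

# Mode 1: the object appears in demo_env; the return value is just the name
result <- fetch_demo("inflation", env = demo_env)
result
#> [1] "inflation"
exists("inflation", envir = demo_env, inherits = FALSE)
#> [1] TRUE

# Mode 2: nothing is assigned; the caller chooses the name
df <- fetch_demo("exchange_rates", auto.assign = FALSE)
exists("exchange_rates", envir = demo_env, inherits = FALSE)
#> [1] FALSE
```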

When to use each mode:

Situation                                       Use
Interactive analysis at the top level           auto.assign = TRUE (default)
You want a custom variable name                 auto.assign = FALSE
Inside a function you are writing               auto.assign = FALSE
Inside lapply() or a loop that returns results  auto.assign = FALSE
Building a pipeline with |>                     auto.assign = FALSE

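Important: Never rely on the default auto.assign = TRUE inside a function body. Unless you supply env, it silently writes to .GlobalEnv, which produces hard-to-trace side effects. Use auto.assign = FALSE in non-interactive code, or direct the assignment to an explicit environment as shown in Part 7.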


Part 2 – Loading Multiple Datasets

This is where auto.assign = TRUE becomes genuinely powerful. You can retrieve several datasets in a loop without writing a separate assignment for each one.

datasets <- c("inflation", "exchange_rates", "crude_oil")

for (key in datasets) {
  nga_cbn(key)
}

# All three now exist as named objects in your environment
ls()
#> [1] "crude_oil"      "exchange_rates" "inflation"

head(inflation)
head(exchange_rates)
head(crude_oil)

Each dataset key becomes its own variable name automatically. A researcher building a macro model can load their full data universe in three lines.

You can also use lapply() for the same result:

invisible(lapply(datasets, nga_cbn))

invisible() suppresses the list of dataset names that lapply() would otherwise print to the console.
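
If you would rather keep the datasets together than scatter them across your environment, a named list is a common alternative. The sketch below uses a hypothetical fetch_demo() that stands in for nga_cbn(key, auto.assign = FALSE) and returns invented rows, so it runs without a network connection.

```r
# Hypothetical stand-in for nga_cbn(key, auto.assign = FALSE);
# the returned values are made up.
fetch_demo <- function(key) {
  data.frame(dataset = key, value = seq_len(3))
}

datasets <- c("inflation", "exchange_rates", "crude_oil")

# With simplify = FALSE the result stays a list, and USE.NAMES = TRUE
# names each element after the key that produced it.
macro <- sapply(datasets, fetch_demo, simplify = FALSE, USE.NAMES = TRUE)

names(macro)
#> [1] "inflation"      "exchange_rates" "crude_oil"

nrow(macro$crude_oil)
#> [1] 3
```

With the real function, the equivalent call would presumably pass auto.assign = FALSE through sapply()'s extra arguments.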


Part 3 – Date Filtering

Use from and to to retrieve only the period you need. Both arguments accept either a character string in "YYYY-MM-DD" format or an R Date object.

Retrieve from a specific start date

# All inflation data from January 2015 onwards
recent_infl <- nga_cbn(
  "inflation",
  from        = "2015-01-01",
  auto.assign = FALSE
)

Retrieve a specific window

# The years around the 2016 recession (2015 through 2018)
recession <- nga_cbn(
  "inflation",
  from        = "2015-01-01",
  to          = "2018-12-31",
  auto.assign = FALSE
)

Retrieve the most recent 12 months

# Dynamic: retrieves the last 365 days (roughly 12 months) relative to the day you run it
nga_cbn(
  "exchange_rates",
  from = as.character(Sys.Date() - 365),
  to   = as.character(Sys.Date())
)
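
Subtracting 365 days is only approximately 12 months, because it drifts by a day whenever the window spans 29 February. If you need an exact calendar-month boundary, base R's seq() method for Date objects can step back by months. months_back() below is a hypothetical helper written for this example, not part of opennaijR.

```r
# Hypothetical helper: step back n calendar months from a reference date.
# Uses only base R; seq() on a Date accepts strings such as "-12 months".
months_back <- function(ref = Sys.Date(), n = 12) {
  seq(ref, by = paste0("-", n, " months"), length.out = 2)[2]
}

ref <- as.Date("2024-03-15")

months_back(ref)   # exact calendar boundary
#> [1] "2023-03-15"

ref - 365          # one day off, because 2024 is a leap year
#> [1] "2023-03-16"

# Since `from` accepts Date objects, the result could be passed directly:
# nga_cbn("exchange_rates", from = months_back(), auto.assign = FALSE)
```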

Part 4 – Filtering Specific Variables

By default, nga_cbn() returns every indicator in a dataset. Use variables to retrieve only the columns you need. This is useful when a dataset is wide and you only care about one or two series.

# Return only headline and food inflation -- skip all other columns
infl_narrow <- nga_cbn(
  "inflation",
  variables   = c("headline_yoy", "food_yoy"),
  auto.assign = FALSE
)

names(infl_narrow)
#> [1] "date"         "headline_yoy" "food_yoy"
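
Conceptually, variables behaves like a column selection that always keeps date. The sketch below mimics that on a local data frame with invented values. select_variables() is hypothetical, and raising an error for unknown names is an assumption made for the example; the source does not say how nga_cbn() treats codes that do not exist.

```r
# Hypothetical helper: keep `date` plus the requested columns.
select_variables <- function(data, variables = NULL) {
  if (is.null(variables)) return(data)         # NULL keeps every column
  unknown <- setdiff(variables, names(data))
  if (length(unknown) > 0) {
    # Assumption: unknown codes are treated as an error
    stop("Unknown variable(s): ", paste(unknown, collapse = ", "))
  }
  data[, c("date", variables), drop = FALSE]
}

# Made-up values, only to give the example something to select from
wide <- data.frame(
  date             = as.Date(c("2020-01-01", "2020-02-01")),
  headline_yoy     = c(1, 2),
  food_yoy         = c(3, 4),
  core_ex_farm_yoy = c(5, 6)
)

names(select_variables(wide, c("headline_yoy", "food_yoy")))
#> [1] "date"         "headline_yoy" "food_yoy"
```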

Part 5 – Cache Management with refresh

opennaijR caches every download locally using the pins package. After the first call, subsequent calls to the same dataset load from disk and are nearly instant – no network request is made.

# First call -- downloads from CBN (takes a moment)
nga_cbn("inflation")

# Second call -- loads from local cache (instant)
nga_cbn("inflation")

When to use refresh = TRUE

The CBN periodically revises historical data and publishes new monthly figures. Your local cache does not update automatically. Use refresh = TRUE when:

  • A new month of data has been published
  • You suspect the CBN has revised historical figures
  • You want to verify your results against the latest official release

# Bypass the cache and download the latest data
nga_cbn("inflation", refresh = TRUE)

# Refresh and capture explicitly
infl_latest <- nga_cbn(
  "inflation",
  refresh     = TRUE,
  auto.assign = FALSE
)
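
The logic behind refresh can be sketched with a plain file cache. This is not the pins-based implementation that opennaijR uses; cached_fetch() and slow_download() are hypothetical names, and the example stores .rds files in a temporary directory purely to show the cache-hit, cache-miss and forced-refresh paths.

```r
# A throwaway cache directory for the demonstration
cache_dir <- tempfile("demo_cache_")
dir.create(cache_dir)

downloads <- 0  # counts how often the pretend network is hit

# Hypothetical stand-in for the network request; returns made-up rows
slow_download <- function(key) {
  downloads <<- downloads + 1
  data.frame(dataset = key, value = 1:3)
}

cached_fetch <- function(key, refresh = FALSE) {
  path <- file.path(cache_dir, paste0(key, ".rds"))
  if (!refresh && file.exists(path)) {
    return(readRDS(path))      # cache hit: no download
  }
  data <- slow_download(key)   # cache miss, or refresh = TRUE
  saveRDS(data, path)          # overwrite the cached copy
  data
}

first  <- cached_fetch("inflation")                  # downloads
second <- cached_fetch("inflation")                  # served from disk
third  <- cached_fetch("inflation", refresh = TRUE)  # downloads again

downloads
#> [1] 2
```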

Part 6 – Raw vs Canonical Output

Canonical output (default)

infl_clean <- nga_cbn(
  "inflation",
  canonical   = TRUE,    # this is the default
  auto.assign = FALSE
)

names(infl_clean)
#> [1] "date"             "headline_yoy"     "food_yoy"
#>     "core_ex_farm_yoy" ...

canonical = TRUE applies the full opennaijR cleaning pipeline:

  • Column names are normalised to snake_case
  • Date strings are parsed into R Date objects
  • The dataset is reshaped to tidy format
  • The object carries opennaijR_tbl class and source attributes

This is the output that apply_projection() and derive_measure() expect.
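
To make the cleaning steps concrete, here is a minimal base R sketch of the first, second and fourth bullets applied to an invented raw table. canonicalise_demo() is hypothetical and far simpler than the real pipeline; the raw column names and the "source" attribute name are assumptions chosen for illustration.

```r
# Hypothetical, simplified version of the cleaning steps listed above
canonicalise_demo <- function(raw) {
  out <- raw

  # 1. snake_case: lower-case, turn runs of other characters into "_",
  #    then trim any leading or trailing underscore
  nm <- gsub("[^a-z0-9]+", "_", tolower(names(out)))
  names(out) <- gsub("^_|_$", "", nm)

  # 2. parse date strings into Date objects
  out$date <- as.Date(out$date, format = "%Y-%m-%d")

  # 3. tag the object with a class and a source attribute
  class(out) <- c("opennaijR_tbl", class(out))
  attr(out, "source") <- "demo"   # assumed attribute name
  out
}

# Made-up raw input with untidy names and string dates
raw <- data.frame(
  "Date"           = c("2020-01-01", "2020-02-01"),
  "Headline (YoY)" = c(1, 2),
  check.names      = FALSE
)

clean <- canonicalise_demo(raw)

names(clean)
#> [1] "date"         "headline_yoy"
class(clean$date)
#> [1] "Date"
class(clean)
#> [1] "opennaijR_tbl" "data.frame"
```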

Raw output

infl_raw <- nga_cbn(
  "inflation",
  raw         = TRUE,
  auto.assign = FALSE
)

# Inspect what the CBN API actually returns before any cleaning
str(infl_raw)

raw = TRUE returns the API response exactly as received – messy column names, string dates, and all. Use this when:

  • You are debugging a connector
  • You want to inspect what the CBN actually publishes before cleaning
  • You are building a custom parser and need the source structure

Do not use raw = TRUE for analysis. The canonical output is always preferable for any downstream work.


Part 7 – Controlling the Assignment Environment

By default, auto.assign = TRUE writes to .GlobalEnv. You can redirect this to a different environment using env.

# Create a dedicated environment for macro data
macro_env <- new.env(parent = emptyenv())

# Load datasets into that environment instead of .GlobalEnv
nga_cbn("inflation",     env = macro_env)
nga_cbn("exchange_rates", env = macro_env)

# Access them through the environment
ls(macro_env)
#> [1] "exchange_rates" "inflation"

head(macro_env$inflation)

This keeps your global environment clean when working with many datasets, and is the correct pattern for package development or Shiny applications where polluting .GlobalEnv is unacceptable.
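
Once datasets live in a dedicated environment, base R's mget() can hand them to downstream code as a named list. The sketch below fills the environment with made-up data frames using assign(), standing in for the nga_cbn(..., env = macro_env) calls above, so it runs without the package.

```r
macro_env <- new.env(parent = emptyenv())

# Stand-ins for nga_cbn("inflation", env = macro_env) and friends;
# the contents are invented.
assign("inflation",      data.frame(value = 1:2), envir = macro_env)
assign("exchange_rates", data.frame(value = 3:5), envir = macro_env)

# Collect everything in the environment into a named list
macro_list <- mget(ls(macro_env), envir = macro_env)

names(macro_list)
#> [1] "exchange_rates" "inflation"

# Remove a single dataset without touching .GlobalEnv
rm("inflation", envir = macro_env)
ls(macro_env)
#> [1] "exchange_rates"
```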


Part 8 – Full Pipeline Integration

nga_cbn() with auto.assign = FALSE slots directly into a pipeline:

library(opennaijR)

nga_cbn("inflation", auto.assign = FALSE) |>
  apply_projection(
    cols   = c("date", "headline_yoy", "food_yoy"),
    rename = c(Date = "date", Headline = "headline_yoy", Food = "food_yoy"),
    reason = "Select key series"
  ) |>
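  derive_measure(
    gap          = Headline - Food,
    # dplyr::lag() shifts values by one row; base stats::lag() only shifts
    # the time index, so the comparison would always be FALSE
    accelerating = Headline > dplyr::lag(Headline),
    reason       = "Diagnostic indicators"
  ) |>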
  plot_inflation_shocks(
    val_col = "Headline",
    title   = "Nigeria Headline Inflation"
  )

From raw CBN data to a fully annotated chart in a single pipeline of four calls.


Part 9 – The Future Pattern

When opennaijR expands to other African countries, every new function will follow the same interface as nga_cbn(). The arguments, the caching behaviour, the auto.assign pattern, and the canonical output will all work identically. Only the function name changes.

# Ghana -- Bank of Ghana
gha_bog("inflation", from = "2015-01-01")

# Kenya -- Central Bank of Kenya
ken_cbk("exchange_rates", auto.assign = FALSE)

# South Africa -- South African Reserve Bank
zaf_sarb("inflation", refresh = TRUE)

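# Load inflation data from four countries in one loop.
# auto.assign = FALSE matters here: with the default, every call would
# write to the same variable, `inflation`, and overwrite the previous country.
sources <- list(
  nga = "nga_cbn",
  gha = "gha_bog",
  ken = "ken_cbk",
  zaf = "zaf_sarb"
)

# lapply() keeps the names, so the result is a list keyed by country code
inflation_by_country <- lapply(sources, function(fn) {
  do.call(fn, list("inflation", auto.assign = FALSE))
})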

A researcher doing a comparative African inflation study loads all four countries’ data with the same loop structure – no special cases, no different arguments, no surprises.
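
For a comparative study it is usually convenient to stack the per-country tables into one long data frame with a country column. The sketch below assumes each connector returns a data frame with the same columns, and uses invented values in place of real connector output because the non-Nigerian functions do not exist yet.

```r
# Made-up per-country results standing in for connector output
by_country <- list(
  nga = data.frame(date = as.Date("2020-01-01"), headline_yoy = 1),
  gha = data.frame(date = as.Date("2020-01-01"), headline_yoy = 2)
)

# Prepend the country code to each table, then bind the rows together
tagged  <- Map(function(df, code) cbind(country = code, df),
               by_country, names(by_country))
stacked <- do.call(rbind, tagged)
rownames(stacked) <- NULL

names(stacked)
#> [1] "country"      "date"         "headline_yoy"
nrow(stacked)
#> [1] 2
```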


Quick Reference

# Load and auto-assign
nga_cbn("inflation")

# Load and assign yourself
df <- nga_cbn("inflation", auto.assign = FALSE)

# Load multiple datasets at once
invisible(lapply(c("inflation", "exchange_rates"), nga_cbn))

# Filter to a date range
nga_cbn("inflation", from = "2020-01-01", to = "2023-12-31")

# Filter to specific variables
nga_cbn("inflation", variables = c("headline_yoy", "food_yoy"),
        auto.assign = FALSE)

# Force fresh download
nga_cbn("inflation", refresh = TRUE)

# Get raw uncleaned API response
nga_cbn("inflation", raw = TRUE, auto.assign = FALSE)

# Load into a specific environment
nga_cbn("inflation", env = macro_env)

# Full pipeline (dplyr::lag() shifts values by one row; stats::lag() does not)
nga_cbn("inflation", auto.assign = FALSE) |>
  apply_projection(cols = c("date", "headline_yoy")) |>
  derive_measure(accel = headline_yoy - dplyr::lag(headline_yoy))