New Business and financial analysts are using R everyday. Most of these new R user are coming from a non-programming background and their tool of choice is Excel. So let’s begin with, “How to do a VLOOKUP in R ?”

VLOOKUP is an Excel function to lookup and retrieve data from a specific column in table. “tidyquant” package in R is the solution for it.

Install & Load Packages :

install.packages("tidyquant")
install.packages("dplyr")
library(tidyquant)
library(dplyr)

The most popular Excel Reference Function, VLOOKUP, is now in R as VLOOKUP() in tidyquant package.

Beginning with creating 2 datasets :

First dataset of 5 customers with total numbers of calls done in each week for a month.

df1 <-data.frame(CustID= c(rep(1001,4),rep(1002,4),rep(1003,4),rep(1004,4),rep(1005,4)),
                 Week = c(1,2,3,4),
                 Calls = c(56,49,140,182,70,7,7,84,63,14,63,133,63,63,49,168,56,35,147,56))

Second dataset gives general Information of these 5 Customers.

df2 <- data.frame(CustID = c(1001,1002,1003,1004,1005),
                  Age = c(29,22,29,33,32),
                  Gender = c("F","M","F","F","M"))

Simple VLOOKUP Case :

Here we just VLOOKUP a single value. Mapping Age of a customer whose ID is 1001.

VLOOKUP(1001, df2, CustID, Age)
## [1] 29

So like Excel,

1001 : Value to map

df2 : Data form which the mapping is done

CustID : Key Value for mapping

Age : The Column that needs to be VLOOKUP

Now let’s vlookup complete column in a dataset. We can do this by using the mutate() function from dplyr. This works because VLOOKUP() is vectorized.

As mutate() function is used to create new column from a data set. Here we are creating “Age” column.

df1 %>%
  mutate(Age = VLOOKUP(CustID, df2, CustID, Age))
##    CustID Week Calls Age
## 1    1001    1    56  29
## 2    1001    2    49  29
## 3    1001    3   140  29
## 4    1001    4   182  29
## 5    1002    1    70  22
## 6    1002    2     7  22
## 7    1002    3     7  22
## 8    1002    4    84  22
## 9    1003    1    63  29
## 10   1003    2    14  29
## 11   1003    3    63  29
## 12   1003    4   133  29
## 13   1004    1    63  33
## 14   1004    2    63  33
## 15   1004    3    49  33
## 16   1004    4   168  33
## 17   1005    1    56  32
## 18   1005    2    35  32
## 19   1005    3   147  32
## 20   1005    4    56  32

Mapping more than 1 column:

df1 %>%
  mutate(Age = VLOOKUP(CustID, df2, CustID, Age)) %>%
  mutate(Gender = VLOOKUP(CustID, df2, CustID, Gender))
##    CustID Week Calls Age Gender
## 1    1001    1    56  29      F
## 2    1001    2    49  29      F
## 3    1001    3   140  29      F
## 4    1001    4   182  29      F
## 5    1002    1    70  22      M
## 6    1002    2     7  22      M
## 7    1002    3     7  22      M
## 8    1002    4    84  22      M
## 9    1003    1    63  29      F
## 10   1003    2    14  29      F
## 11   1003    3    63  29      F
## 12   1003    4   133  29      F
## 13   1004    1    63  33      F
## 14   1004    2    63  33      F
## 15   1004    3    49  33      F
## 16   1004    4   168  33      F
## 17   1005    1    56  32      M
## 18   1005    2    35  32      M
## 19   1005    3   147  32      M
## 20   1005    4    56  32      M