Extracting Year from Dates in R

 

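The examples below use a data frame named date with two columns stored as text: date1 holds dates written as dd-mm-yyyy, and date2 holds date-times written as dd-mm-yyyy hh:mm:ss. We will extract the year values from both.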

       date1                date2
1 24-05-2023 19-01-2018  03:14:07
2 10-07-2020 07-12-1998  22:07:56
3 01-11-1987 24-05-2023  12:47:15
4 20-04-2007 18-09-2021  10:35:15
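
For readers who want to follow along, the data above can be recreated with a short sketch like the one below. The data frame name date is taken from the later calls such as date$date1; storing both columns as character strings is an assumption about how the data was read in.

```r
# Recreate the example data; both columns are kept as character strings
date <- data.frame(
  date1 = c("24-05-2023", "10-07-2020", "01-11-1987", "20-04-2007"),
  date2 = c("19-01-2018 03:14:07", "07-12-1998 22:07:56",
            "24-05-2023 12:47:15", "18-09-2021 10:35:15"),
  stringsAsFactors = FALSE
)

# Inspect the structure to confirm the column types
str(date)
```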

Let us explore a few different ways of extracting the year from dates in R.

 

1. Using ‘format()’ function

First, the date columns must be converted to a date class, because R usually reads dates as character values. Use as.Date() for dates and as.POSIXct() for date-times, and pass a format string that describes how the values are written, for example “%d-%m-%Y” for dd-mm-yyyy.

# Parse the character columns into Date and POSIXct (date-time) objects
date$date1 <- as.Date(date$date1, format = "%d-%m-%Y")
date$date2 <- as.POSIXct(date$date2, format = "%d-%m-%Y %H:%M:%S")
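
A quick way to confirm that the conversion worked is to inspect the class of the result and look for NA values, which R returns when a string does not match the format. The following is a minimal sketch on standalone values; the variable names d1, d2 and bad are illustrative, and tz = "UTC" is an assumption added only to make the result independent of the local time zone.

```r
# A date-only string becomes a Date object
d1 <- as.Date("24-05-2023", format = "%d-%m-%Y")
class(d1)
# [1] "Date"

# A date-time string becomes a POSIXct object
d2 <- as.POSIXct("19-01-2018 03:14:07", format = "%d-%m-%Y %H:%M:%S", tz = "UTC")
class(d2)
# [1] "POSIXct" "POSIXt"

# A format that does not describe the string (here 24 is read as a month) gives NA
bad <- as.Date("24-05-2023", format = "%m-%d-%Y")
is.na(bad)
# [1] TRUE
```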

Then, to extract the year values, we simply use “%Y” as the format specifier. Note that format() returns the years as character strings, not numbers.

format(date$date1, "%Y")
[1] "2023" "2020" "1987" "2007"
format(date$date2, "%Y")
[1] "2018" "1998" "2023" "2021"
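
Because format() returns text, the years need one more step before they can be used in arithmetic or comparisons. A minimal sketch, assuming the same four dates as date1 (the names d and yr are illustrative):

```r
d <- as.Date(c("24-05-2023", "10-07-2020", "01-11-1987", "20-04-2007"),
             format = "%d-%m-%Y")

# Convert the character years to integers
yr <- as.integer(format(d, "%Y"))
yr
# [1] 2023 2020 1987 2007

# Numeric years can be compared directly
yr >= 2010
# [1]  TRUE  TRUE FALSE FALSE
```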

 

2. Using ‘substring()’ function

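This function can be used when the year occupies the same fixed character positions in every date string. It works on the text itself, so apply it to the original character column rather than to a column that has already been converted to Date (a Date is written out as yyyy-mm-dd, which shifts the positions).

If the dates are written as dd-mm-yyyy, the year sits at positions 7 to 10, and the following command can be used: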

substring(date$date1,7,10)
[1] "2023" "2020" "1987" "2007"

If your dates are in a different format, adjust the starting and ending positions accordingly in substring(x, first, last). Like format(), substring() returns character strings.
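
As an illustration of adjusting the positions, the sketch below uses hypothetical dates written as yyyy-mm-dd, where the year occupies positions 1 to 4. It also shows what happens when substring() is applied to a Date object instead of the original text: the object is first turned into its yyyy-mm-dd text form, so positions 7 to 10 no longer hold the year.

```r
# Hypothetical dates in yyyy-mm-dd format: the year is in positions 1 to 4
iso <- c("2023-05-24", "2020-07-10")
substring(iso, 1, 4)
# [1] "2023" "2020"

# Pitfall: a Date object is converted to "2023-05-24" before the substring is taken
d <- as.Date("24-05-2023", format = "%d-%m-%Y")
substring(d, 7, 10)
# [1] "5-24"
```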

 

3. Using ‘lubridate’ package

Here, too, the date values must first be converted to a date class. The parsing function is chosen to match the order of the components: dmy() for dd-mm-yyyy, mdy() for mm-dd-yyyy, dmy_hms() for dd-mm-yyyy hh:mm:ss, and so on. These functions expect the original character values. After that, the year() function extracts the years as numbers from Date or POSIXct objects.

library(lubridate)
date$date2<-dmy_hms(date$date2)
year(date$date2)
[1] 2018 1998 2023 2021
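
The same approach applies to the date-only column. A minimal sketch, assuming the lubridate package is installed and using the date1 values as a standalone character vector (the names date1 and parsed are illustrative):

```r
library(lubridate)

date1 <- c("24-05-2023", "10-07-2020", "01-11-1987", "20-04-2007")

# dmy() parses day-month-year strings into Date objects
parsed <- dmy(date1)
class(parsed)
# [1] "Date"

# year() returns the years as numbers
year(parsed)
# [1] 2023 2020 1987 2007
```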

 

4. Using ‘date’ package

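As in the previous example, the data must first be converted to a date class before the years can be extracted. The as.date() function takes the original character values, with the order argument giving the order of day, month and year, and the resulting column has class “date”. The date.mdy() function then splits each date into month, day and year components, from which we take year.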

library(date)
date$date1<-as.date(date$date1, order = "dmy")
date.mdy(date$date1)$year
[1] 2023 2020 1987 2007

 

5. Using ‘data.table’ package

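The as.IDate() function converts a date column to the IDate class, an integer-based representation of dates in data.table that is fast and memory-efficient. The year() function from data.table is then applied to the result. Because as.IDate() keeps only the date part, the time in date2 is dropped. If lubridate is also loaded, both packages provide a year() function, and data.table::year() selects this one explicitly. This can be done as follows: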

library(data.table)
year(as.IDate(date$date1, '%d-%m-%Y'))
[1] 2023 2020 1987 2007
year(as.IDate(date$date2, '%d-%m-%Y %H:%M:%S'))
[1] 2018 1998 2023 2021
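
In practice the year is often stored as a new column. The sketch below assumes the data.table package is installed and uses a small table dt with a hypothetical column name yr; it calls data.table::year() explicitly so the result does not depend on whether another package that defines year() is loaded.

```r
library(data.table)

dt <- data.table(
  date1 = c("24-05-2023", "10-07-2020", "01-11-1987", "20-04-2007")
)

# Add the year as a new column by reference, without copying the table
dt[, yr := data.table::year(as.IDate(date1, format = "%d-%m-%Y"))]
dt$yr
# [1] 2023 2020 1987 2007
```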