Option 2
Using pandas.Series.str.slice as follows:
df['col_substring'] = df['col'].str.slice(0, 4)

[Out]:
          col col_substring
0  2020-12-08          2020
1  2020-12-08          2020
2  2020-12-08          2020
3  2020-12-08          2020
4  2020-12-08          2020
5  2020-12-08          2020
6  2020-12-08          2020
7  2020-12-08          2020
8  2020-12-08          2020
9  2020-12-08          2020
or, equivalently, passing only the stop argument:
df['col_substring'] = df['col'].str.slice(stop=4)
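To check that the two calls are interchangeable, here is a minimal sketch. The three-row sample frame is an assumption made for illustration, not the data from the question; the last row is deliberately shorter than four characters to show that slicing returns what is available rather than raising.

```python
import pandas as pd

# Hypothetical sample data, including one value shorter than 4 characters.
df = pd.DataFrame({"col": ["2020-12-08", "2021-01-15", "abc"]})

# Positional form: start=0, stop=4.
a = df["col"].str.slice(0, 4)

# Keyword form: start defaults to the beginning of the string.
b = df["col"].str.slice(stop=4)

# Series.equals compares values element by element.
same = a.equals(b)
print(a.tolist(), same)
```

Both forms behave like the Python slice x[:4] applied to each element.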
Option 3
Using a custom lambda function
df['col_substring'] = df['col'].apply(lambda x: x[:4])

[Out]:
          col col_substring
0  2020-12-08          2020
1  2020-12-08          2020
2  2020-12-08          2020
3  2020-12-08          2020
4  2020-12-08          2020
5  2020-12-08          2020
6  2020-12-08          2020
7  2020-12-08          2020
8  2020-12-08          2020
9  2020-12-08          2020
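One caveat with the plain lambda: x[:4] raises a TypeError if the column contains a missing value, because None and NaN cannot be sliced. A possible guard is sketched below; the helper name first_four and the sample data are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical sample data with one missing value.
df = pd.DataFrame({"col": ["2020-12-08", None, "2021-01-15"]})


def first_four(value):
    """Return the first four characters of a string; pass anything else through."""
    return value[:4] if isinstance(value, str) else value


df["col_substring"] = df["col"].apply(first_four)
result = df["col_substring"].tolist()
print(result)
```

The .str accessor used in Option 2 already propagates missing values, so this guard is only needed for the apply-based variant.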
Option 4
Using a custom lambda function with a regular expression (with re)
import re

df['col_substring'] = df['col'].apply(lambda x: re.findall(r'^.{4}', x)[0])

[Out]:
          col col_substring
0  2020-12-08          2020
1  2020-12-08          2020
2  2020-12-08          2020
3  2020-12-08          2020
4  2020-12-08          2020
5  2020-12-08          2020
6  2020-12-08          2020
7  2020-12-08          2020
8  2020-12-08          2020
9  2020-12-08          2020
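Note that re.findall(...)[0] raises an IndexError when a value has fewer than four characters, since the pattern then matches nothing. As a possible alternative (a different technique from the lambda above, not a drop-in equivalent), pandas.Series.str.extract applies the same pattern with a capture group and yields a missing value where there is no match. The sample data here is assumed for illustration.

```python
import pandas as pd

# Hypothetical sample data; the second value is too short to match.
df = pd.DataFrame({"col": ["2020-12-08", "abc"]})

# The capture group (.{4}) is what str.extract returns;
# expand=False gives back a Series instead of a one-column DataFrame.
df["col_substring"] = df["col"].str.extract(r"^(.{4})", expand=False)
print(df)
```

Rows that do not match end up as missing values rather than stopping the whole assignment with an exception.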
Option 5
Using numpy.vectorize
import numpy as np

df['col_substring'] = np.vectorize(lambda x: x[:4])(df['col'])

[Out]:
          col col_substring
0  2020-12-08          2020
1  2020-12-08          2020
2  2020-12-08          2020
3  2020-12-08          2020
4  2020-12-08          2020
5  2020-12-08          2020
6  2020-12-08          2020
7  2020-12-08          2020
8  2020-12-08          2020
9  2020-12-08          2020
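The NumPy documentation describes np.vectorize as a convenience wrapper that is essentially a for loop, so it should not be assumed to be faster than the other options. If speed matters, it is worth measuring on your own data. The sketch below is one rough way to do that; the 10,000-row frame and the repeat count are arbitrary assumptions, and the timings will differ between machines and library versions.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical benchmark data: 10,000 identical date strings.
df = pd.DataFrame({"col": ["2020-12-08"] * 10_000})

# Each candidate returns the first four characters of every value.
candidates = {
    "str.slice": lambda: df["col"].str.slice(0, 4),
    "apply": lambda: df["col"].apply(lambda x: x[:4]),
    "np.vectorize": lambda: np.vectorize(lambda x: x[:4])(df["col"]),
}

# Total wall-clock time for 5 runs of each candidate.
timings = {name: timeit.timeit(fn, number=5) for name, fn in candidates.items()}

for name, seconds in timings.items():
    print(f"{name}: {seconds:.4f} s")
```
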
Note: