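pandas, numpy, datetime, and time are the four commonly used time-related modules. timestamp, datetime, and str are the three main kinds of data type in everyday use. The tangled relationships among them need to be sorted out.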
The Python world has a number of available representations of dates, times, deltas, and timespans.
datetime and dateutil
Python’s basic objects for working with dates and times reside in the built-in datetime module.
The third-party dateutil module can be used to parse dates from a variety of string formats.
The datetime module supplies classes for manipulating dates and times.
The dateutil module provides powerful extensions to the standard datetime module.
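As a rough illustration of how the two modules divide the work, the sketch below builds a date by hand with datetime and parses a free-form string with dateutil. The sample date and the variable names are chosen here for illustration only, and dateutil has to be installed separately.

```python
from datetime import datetime, timedelta

from dateutil import parser  # third-party package: python-dateutil

# Build a datetime by hand with the standard library.
d = datetime(year=2015, month=7, day=4)

# Let dateutil work out the format of a free-form string.
parsed = parser.parse("4th of July, 2015")

# Date arithmetic uses timedelta; strftime turns a datetime back into a str.
next_day = d + timedelta(days=1)
weekday = d.strftime("%A")

print(parsed, next_day, weekday)
```

Both routes end in the same datetime.datetime object, so everything the standard module offers (arithmetic, formatting) is available after parsing.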
NumPy's datetime64
The weaknesses of Python’s datetime format inspired the NumPy team to add a set of native time series data types to NumPy.
The datetime64 dtype encodes dates as 64-bit integers, and thus allows arrays of dates to be represented very compactly.
The datetime64 type requires a very specific input format.
Because NumPy datetime64 arrays have a uniform type, vectorized operations on them (for example, adding an offset to every date at once) can be accomplished much more quickly than if we were working directly with Python’s datetime objects.
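A minimal sketch of such a vectorized operation, assuming a start date and a run of twelve daily offsets picked purely for illustration:

```python
import numpy as np

# A single date stored as datetime64; the unit (day) is inferred from the string.
date = np.array("2015-07-04", dtype=np.datetime64)

# Twelve offsets of 0..11 days, expressed as timedelta64 values.
offsets = np.arange(12) * np.timedelta64(1, "D")

# One vectorized addition produces twelve consecutive dates, with no Python loop.
dates = date + offsets

print(dates)
```

The equivalent with plain datetime objects would need a loop or list comprehension that creates one Python object per date.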
Starting in NumPy 1.7, there are core array data types which natively support datetime functionality. The data type is called “datetime64”, so named because “datetime” is already taken by the datetime library included in Python.
The most basic way to create datetimes is from strings in ISO 8601 date or datetime format.
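For instance, the following sketch creates three values from ISO 8601 strings of different precision; the dates themselves are arbitrary examples. NumPy infers the storage unit from how much detail the string carries.

```python
import numpy as np

day = np.datetime64("2005-02-25")            # date only: unit is D (day)
minute = np.datetime64("2005-02-25T03:30")   # date and time: unit is m (minute)
month = np.datetime64("2005-02")             # year and month: unit is M (month)

print(day.dtype, minute.dtype, month.dtype)
```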
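The unit for internal storage is one of: Y, M, W, D, h, m, s, ms, us, ns, ps, fs, as (year, month, week, day, hour, minute, second, millisecond, microsecond, nanosecond, picosecond, femtosecond, attosecond).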
datetime64 is the data type; datetime64[ns] or datetime64[s] or datetime64[unit] is datetime64 with a unit.
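The sketch below shows how the unit appears in practice: it can be passed explicitly when creating a value, read back from the dtype, and changed with astype. The timestamps are made-up examples.

```python
import numpy as np

# The same instant stored with two different units.
t_s = np.datetime64("2015-07-04T12:00", "s")    # second resolution
t_ns = np.datetime64("2015-07-04T12:00", "ns")  # nanosecond resolution

# np.datetime_data reports the unit (and step count) behind a dtype.
unit_info = np.datetime_data(t_ns.dtype)

# Arrays carry the unit in their dtype and can be converted to another unit.
arr = np.array(["2015-07-04", "2015-07-05"], dtype="datetime64[ns]")
arr_days = arr.astype("datetime64[D]")

print(t_s, t_ns, unit_info, arr_days.dtype)
```

The choice of unit is a trade-off: since every value is a 64-bit integer count of that unit, a finer unit gives more resolution but covers a shorter span of time.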
Finally, we will note that while the datetime64 data type addresses some of the deficiencies of the built-in Python datetime type, it lacks many of the convenient methods and functions provided by datetime and especially dateutil.
pandas: best of both worlds
Pandas builds upon all the tools just discussed to provide a Timestamp object, which combines the ease of use of datetime and dateutil with the efficient storage and vectorized interface of numpy.datetime64.
From a group of these Timestamp objects, Pandas can construct a DatetimeIndex that can be used to index data in a Series or DataFrame.
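A small sketch of that workflow, with invented dates and values: a list of Timestamp objects becomes a DatetimeIndex, and the resulting Series can then be selected by date strings.

```python
import pandas as pd

# A handful of Timestamp objects (dates chosen only for illustration).
stamps = [
    pd.Timestamp("2014-07-04"),
    pd.Timestamp("2014-08-04"),
    pd.Timestamp("2015-07-04"),
    pd.Timestamp("2015-08-04"),
]

# Pandas builds a DatetimeIndex from them, which then indexes a Series.
index = pd.DatetimeIndex(stamps)
data = pd.Series([0, 1, 2, 3], index=index)

# Select by a partial date string (a whole year) or by a date range.
in_2015 = data.loc["2015"]
first_three = data.loc["2014-07-04":"2015-07-04"]

print(in_2015)
print(first_three)
```

Note that slicing a DatetimeIndex by label includes both endpoints, unlike positional slicing.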
Where the Pandas time series tools become useful is when you begin to index data by timestamps.
For timestamps, Pandas provides the Timestamp type: it is essentially a replacement for Python’s native datetime, but is based on the more efficient numpy.datetime64 data type.
For time periods, Pandas provides the Period type, based on numpy.datetime64.
For time deltas or durations, Pandas provides the Timedelta type, based on numpy.timedelta64, a more efficient replacement for Python’s native datetime.timedelta type.
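The three types can be sketched side by side as follows; the specific instant, month, and duration are arbitrary examples.

```python
import pandas as pd

ts = pd.Timestamp("2015-07-04 12:00")    # a single point in time
period = pd.Period("2015-07", freq="M")  # a fixed-frequency span: July 2015
delta = pd.Timedelta(days=1, hours=6)    # a duration

# A Timestamp plus a Timedelta is another Timestamp.
later = ts + delta

# A Period has a start and an end, so we can ask whether a Timestamp falls inside it.
inside = period.start_time <= ts <= period.end_time

print(later, inside, delta.total_seconds())
```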
Python native: the datetime.datetime data type, from module datetime;
More efficient: the datetime64 data type, from module NumPy;
Combining the strengths of both: the Timestamp / Timedelta data types, from module pandas.
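To tie the three families together, here is a hedged sketch of moving one instant between str, datetime.datetime, numpy.datetime64, and pandas.Timestamp. The format string and variable names are illustrative choices, not the only way to do each conversion.

```python
from datetime import datetime

import numpy as np
import pandas as pd

# str -> datetime.datetime, using an explicit format.
py_dt = datetime.strptime("2015-07-04 12:00:00", "%Y-%m-%d %H:%M:%S")

# datetime.datetime -> numpy.datetime64 and pandas.Timestamp.
np_dt = np.datetime64(py_dt)
pd_ts = pd.Timestamp(py_dt)

# And back again.
from_numpy = np_dt.item()            # numpy.datetime64 -> datetime.datetime
from_pandas = pd_ts.to_pydatetime()  # pandas.Timestamp -> datetime.datetime
as_numpy = pd_ts.to_datetime64()     # pandas.Timestamp -> numpy.datetime64

# pandas.Timestamp -> str.
text = pd_ts.strftime("%Y-%m-%d %H:%M:%S")

print(py_dt, np_dt, pd_ts, text)
```

One caveat: item() returns a datetime.datetime only for units of microseconds or coarser; for nanosecond values it returns a plain integer, since Python's datetime cannot hold that precision.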
Reposted from: http://vtge.baihongyu.com/