Iterating through a range of dates in Python
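I have the following code to do this, but how can I do it better? Right now I think it's better than nested loops, but it starts to get Perl-one-linerish when you have a generator in a list comprehension.

    from datetime import timedelta
    from time import strftime

    day_count = (end_date - start_date).days + 1
    for single_date in [d for d in (start_date + timedelta(n) for n in range(day_count)) if d <= end_date]:
        print(strftime("%Y-%m-%d", single_date.timetuple()))
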
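Notes
- I'm not actually using this to print. That's just for demo purposes.
- The start_date and end_date variables are datetime.date objects because I don't need the timestamps. (They're going to be used to generate a report.)
Sample Output
For a start date of 2009-05-30 and an end date of 2009-06-09:

    2009-05-30
    2009-05-31
    2009-06-01
    2009-06-02
    2009-06-03
    2009-06-04
    2009-06-05
    2009-06-06
    2009-06-07
    2009-06-08
    2009-06-09
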
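Why are there two nested iterations? For me it produces the same list of data with only one iteration:

    for single_date in (start_date + timedelta(n) for n in range(day_count)):
        print(...)

And no list gets stored; only one generator is iterated over. Also, the "if" in the generator seems to be unnecessary.
After all, a linear sequence should only require one iterator, not two.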
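
To make the comparison concrete, here is a minimal sketch that builds the dates both ways and checks that they agree. The start and end dates are taken from the sample output in the question; the variable names nested, single and flat are only for illustration.

```python
from datetime import date, timedelta

# Illustrative values, matching the sample output in the question.
start_date = date(2009, 5, 30)
end_date = date(2009, 6, 9)
day_count = (end_date - start_date).days + 1

# Original form: a generator wrapped in a filtering list comprehension.
nested = [d for d in (start_date + timedelta(n) for n in range(day_count))
          if d <= end_date]

# Simplified form: one generator expression, no stored list, no filter.
single = (start_date + timedelta(n) for n in range(day_count))

# Materialize the generator only to compare the two results.
flat = list(single)
print(flat == nested)
print(len(flat))
```

Since day_count already bounds the range, the filter never rejects anything, which is why dropping it leaves the result unchanged.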
Update after discussion with John Machin: maybe the most elegant solution is to use a generator function to completely hide/abstract the iteration over the range of dates:

    from datetime import timedelta, date

    def daterange(start_date, end_date):
        for n in range(int((end_date - start_date).days)):
            yield start_date + timedelta(n)

    start_date = date(2013, 1, 1)
    end_date = date(2015, 6, 2)
    for single_date in daterange(start_date, end_date):
        print(single_date.strftime("%Y-%m-%d"))

Note: for consistency with the built-in range() function, this iteration stops before reaching end_date.
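
Because the daterange generator above iterates over range(days), it excludes its end point, just as range() does. A minimal sketch of one way to get inclusive iteration is to pass the day after end_date. The dates are illustrative, taken from the question's sample output.

```python
from datetime import date, timedelta

def daterange(start_date, end_date):
    # Same half-open behaviour as the generator above: end_date is excluded.
    for n in range((end_date - start_date).days):
        yield start_date + timedelta(n)

start_date = date(2009, 5, 30)
end_date = date(2009, 6, 9)

exclusive = list(daterange(start_date, end_date))
# Pass the day after end_date to make the iteration inclusive.
inclusive = list(daterange(start_date, end_date + timedelta(days=1)))

print(exclusive[-1])
print(inclusive[-1])
```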
This might be more clear:

    import datetime

    d = start_date
    delta = datetime.timedelta(days=1)
    while d <= end_date:
        print(d.strftime("%Y-%m-%d"))
        d += delta

Use the dateutil library:

    from datetime import date
    from dateutil.rrule import rrule, DAILY

    a = date(2009, 5, 30)
    b = date(2009, 6, 9)

    for dt in rrule(DAILY, dtstart=a, until=b):
        print(dt.strftime("%Y-%m-%d"))

This Python library has many more advanced features, some of them very useful.
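Pandas is great for time series in general, and has direct support for date ranges.

    import pandas as pd
    daterange = pd.date_range(start_date, end_date)

You can then loop over the date range to print the dates:

    for single_date in daterange:
        print(single_date.strftime("%Y-%m-%d"))

It also has lots of options to make life easier. For example, if you only want weekdays, you would just swap in bdate_range. See http://pandas.pydata.org/pandas-docs/stable/timeseries.html#generating-ranges-of-timestamps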
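
For readers without pandas, the weekdays-only idea can be approximated with the standard library. This is a plain-Python sketch, not how pandas implements bdate_range. The helper name weekdays_between is hypothetical, and it ignores holidays.

```python
from datetime import date, timedelta

def weekdays_between(start_date, end_date):
    """Yield Monday-to-Friday dates from start_date through end_date (inclusive).

    Hypothetical helper for illustration only; holidays are not considered.
    """
    current = start_date
    while current <= end_date:
        if current.weekday() < 5:  # Monday is 0, Saturday is 5, Sunday is 6
            yield current
        current += timedelta(days=1)

days = list(weekdays_between(date(2009, 5, 30), date(2009, 6, 9)))
for d in days:
    print(d.isoformat(), d.strftime("%a"))
```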
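The power of pandas is really its dataframes, which support vectorized operations (much like numpy) that make operations across large quantities of data very fast and easy.
EDIT: You could also skip the for loop completely and just print the range directly, which is easier and more efficient:

    print(daterange)
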
    import datetime
    from time import strftime

    def daterange(start, stop, step=datetime.timedelta(days=1), inclusive=False):
        # inclusive=False to behave like range by default
        if step.days > 0:
            while start < stop:
                yield start
                start = start + step
                # not +=! don't modify object passed in if it's mutable
                # since this function is not restricted to
                # only types from datetime module
        elif step.days < 0:
            while start > stop:
                yield start
                start = start + step
        if inclusive and start == stop:
            yield start

    # ...

    for date in daterange(start_date, end_date, inclusive=True):
        print(strftime("%Y-%m-%d", date.timetuple()))

This function does more than you strictly require: it supports negative steps, for example. As long as you factor out your range logic, you no longer need a separate day count at the call site.
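
A short usage sketch of the negative-step case follows. It restates the daterange generator above in compact form so the snippet runs on its own. The dates and the names newest, oldest and one_day_back are illustrative.

```python
import datetime

def daterange(start, stop, step=datetime.timedelta(days=1), inclusive=False):
    # Compact restatement of the generator above.
    if step.days > 0:
        while start < stop:
            yield start
            start = start + step
    elif step.days < 0:
        while start > stop:
            yield start
            start = start + step
    if inclusive and start == stop:
        yield start

newest = datetime.date(2009, 6, 3)
oldest = datetime.date(2009, 6, 1)
one_day_back = datetime.timedelta(days=-1)

# Count down from newest to oldest, including the end point.
countdown = list(daterange(newest, oldest, step=one_day_back, inclusive=True))
for d in countdown:
    print(d.isoformat())
```

With inclusive left at its default of False, the same call stops one day short of oldest, mirroring how range() treats its stop value.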
Why not try:

    import datetime as dt

    start_date = dt.datetime(2012, 12, 1)
    end_date = dt.datetime(2012, 12, 5)

    total_days = (end_date - start_date).days + 1  # inclusive 5 days

    for day_number in range(total_days):
        current_date = (start_date + dt.timedelta(days=day_number)).date()
        print(current_date)

This is the most human-readable solution I can think of.

    import datetime

    def daterange(start, end, step=datetime.timedelta(1)):
        curr = start
        while curr < end:
            yield curr
            curr += step

Show the next n days starting from today (here n = 100):

    import datetime
    for i in range(0, 100):
        print((datetime.date.today() + datetime.timedelta(i)).isoformat())

Output (first lines):

    2016-06-29
    2016-06-30
    2016-07-01
    2016-07-02
    2016-07-03
    2016-07-04

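
The loop above adds the offset, so it walks forward from today. To list the previous n days instead, subtract the offset. A minimal sketch follows, with a hypothetical helper name and a fixed reference date so the result is reproducible.

```python
import datetime

def previous_days(n, today=None):
    """Return the n dates ending at `today`, most recent first (hypothetical helper)."""
    if today is None:
        today = datetime.date.today()
    return [today - datetime.timedelta(days=i) for i in range(n)]

# A fixed reference date is passed in so the printed dates do not depend on the clock.
for d in previous_days(3, today=datetime.date(2016, 6, 29)):
    print(d.isoformat())
```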
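NumPy's arange function can be applied to dates:

    import numpy as np
    from datetime import datetime, timedelta

    d0 = datetime(2009, 1, 1)
    d1 = datetime(2010, 1, 1)
    dt = timedelta(days=1)
    dates = np.arange(d0, d1, dt).astype(datetime)

The astype call converts the numpy.datetime64 values into an array of datetime.datetime objects.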

    import datetime

    def daterange(start, stop, step_days=1):
        current = start
        step = datetime.timedelta(step_days)
        if step_days > 0:
            while current < stop:
                yield current
                current += step
        elif step_days < 0:
            while current > stop:
                yield current
                current += step
        else:
            raise ValueError("daterange() step_days argument must not be zero")

    if __name__ == "__main__":
        from pprint import pprint as pp
        lo = datetime.date(2008, 12, 27)
        hi = datetime.date(2009, 1, 5)
        pp(list(daterange(lo, hi)))
        pp(list(daterange(hi, lo, -1)))
        pp(list(daterange(lo, hi, 7)))
        pp(list(daterange(hi, lo, -7)))
        assert not list(daterange(lo, hi, -1))
        assert not list(daterange(hi, lo))
        assert not list(daterange(lo, hi, -7))
        assert not list(daterange(hi, lo, 7))

I had a similar problem, but I needed to iterate monthly instead of daily.
This is my solution:

    import calendar
    from datetime import datetime, timedelta

    def days_in_month(dt):
        return calendar.monthrange(dt.year, dt.month)[1]

    def monthly_range(dt_start, dt_end):
        forward = dt_end >= dt_start
        finish = False
        dt = dt_start

        while not finish:
            yield dt.date()
            if forward:
                days = days_in_month(dt)
                dt = dt + timedelta(days=days)
                finish = dt > dt_end
            else:
                _tmp_dt = dt.replace(day=1) - timedelta(days=1)
                dt = (_tmp_dt.replace(day=dt.day))
                finish = dt < dt_end

Example #1

    date_start = datetime(2016, 6, 1)
    date_end = datetime(2017, 1, 1)

    for p in monthly_range(date_start, date_end):
        print(p)

Output

    2016-06-01
    2016-07-01
    2016-08-01
    2016-09-01
    2016-10-01
    2016-11-01
    2016-12-01
    2017-01-01

Example #2

    date_start = datetime(2017, 1, 1)
    date_end = datetime(2016, 6, 1)

    for p in monthly_range(date_start, date_end):
        print(p)

Output

    2017-01-01
    2016-12-01
    2016-11-01
    2016-10-01
    2016-09-01
    2016-08-01
    2016-07-01
    2016-06-01

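
The forward branch of monthly_range advances by the number of days in the current month, which works for the first-of-month inputs shown in the examples. An alternative sketch, assuming only month boundaries matter, steps the (year, month) pair directly and so does not depend on the starting day. The name month_starts is hypothetical, and this version only iterates forward.

```python
from datetime import date

def month_starts(start, end):
    """Yield the first day of each month from start's month through end's month."""
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        yield date(year, month, 1)
        # Advance one month, rolling over into the next year after December.
        month += 1
        if month > 12:
            year, month = year + 1, 1

months = list(month_starts(date(2016, 6, 1), date(2017, 1, 1)))
for d in months:
    print(d.isoformat())
```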
I can't* believe this question has existed for 9 years without anyone suggesting a simple recursive function:

    from datetime import datetime, timedelta

    def walk_days(start_date, end_date):
        if start_date <= end_date:
            print(start_date.strftime("%Y-%m-%d"))
            next_date = start_date + timedelta(days=1)
            walk_days(next_date, end_date)

    # demo
    start_date = datetime(2009, 5, 30)
    end_date = datetime(2009, 6, 9)

    walk_days(start_date, end_date)

Output:

    2009-05-30
    2009-05-31
    2009-06-01
    2009-06-02
    2009-06-03
    2009-06-04
    2009-06-05
    2009-06-06
    2009-06-07
    2009-06-08
    2009-06-09

Edit: *Now I can believe it; see "Does Python optimize tail recursion?". Thank you, Tim.
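
The practical consequence is that the recursive version uses one stack frame per day, and CPython does not eliminate tail calls, so a long enough range hits the interpreter's recursion limit (typically 1000 by default). A minimal sketch of that failure mode follows. The visit callback and the chosen span are assumptions added for the demonstration.

```python
import sys
from datetime import date, timedelta

def walk_days(start_date, end_date, visit):
    # Recursive form: one stack frame per day in the range.
    if start_date <= end_date:
        visit(start_date)
        walk_days(start_date + timedelta(days=1), end_date, visit)

start = date(2000, 1, 1)
# A span deliberately longer than the interpreter's current recursion limit.
span = sys.getrecursionlimit() + 100
end = start + timedelta(days=span)

try:
    walk_days(start, end, lambda d: None)
    overflowed = False
except RecursionError:
    overflowed = True

# The iterative form covers the same span without deep recursion.
iterative = [start + timedelta(days=n) for n in range(span + 1)]

print(overflowed)
print(len(iterative) == span + 1)
```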

    import datetime

    for i in range(16):
        print(datetime.date.today() + datetime.timedelta(days=i))


    > pip install DateTimeRange

    import datetime
    from datetimerange import DateTimeRange

    def dateRange(start, end, step):
        rangeList = []
        time_range = DateTimeRange(start, end)
        for value in time_range.range(datetime.timedelta(days=step)):
            rangeList.append(value.strftime('%m/%d/%Y'))
        return rangeList

    dateRange("2018-09-07", "2018-12-25", 7)

    Out[92]:
    ['09/07/2018',
     '09/14/2018',
     '09/21/2018',
     '09/28/2018',
     '10/05/2018',
     '10/12/2018',
     '10/19/2018',
     '10/26/2018',
     '11/02/2018',
     '11/09/2018',
     '11/16/2018',
     '11/23/2018',
     '11/30/2018',
     '12/07/2018',
     '12/14/2018',
     '12/21/2018']

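You can generate a series of dates between two dates simply and reliably using the pandas library.

    import pandas as pd

    print(pd.date_range(start='1/1/2010', end='1/08/2018', freq='M'))

You can change the frequency of the generated dates by setting freq to D, M, Q, or Y (daily, monthly, quarterly, yearly).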
A slightly different approach to reversible steps, storing the range arguments in a tuple:

    from datetime import timedelta

    def date_range(start, stop, step=1, inclusive=False):
        day_count = (stop - start).days
        if inclusive:
            day_count += 1

        if step > 0:
            range_args = (0, day_count, step)
        elif step < 0:
            range_args = (day_count - 1, -1, step)
        else:
            raise ValueError("date_range(): step arg must be non-zero")

        for i in range(*range_args):
            yield start + timedelta(days=i)

Here's code for a general date range function, similar to Ber's answer, but more flexible:

    import datetime
    import functools

    def count_timedelta(delta, step, seconds_in_interval):
        """Helper function for iterate.  Finds the number of intervals in the timedelta."""
        return int(delta.total_seconds() / (seconds_in_interval * step))

    def range_dt(start, end, step=1, interval='day'):
        """Iterate over datetimes or dates, similar to builtin range."""
        intervals = functools.partial(count_timedelta, (end - start), step)

        if interval == 'week':
            for i in range(intervals(3600 * 24 * 7)):
                yield start + datetime.timedelta(weeks=i) * step

        elif interval == 'day':
            for i in range(intervals(3600 * 24)):
                yield start + datetime.timedelta(days=i) * step

        elif interval == 'hour':
            for i in range(intervals(3600)):
                yield start + datetime.timedelta(hours=i) * step

        elif interval == 'minute':
            for i in range(intervals(60)):
                yield start + datetime.timedelta(minutes=i) * step

        elif interval == 'second':
            for i in range(intervals(1)):
                yield start + datetime.timedelta(seconds=i) * step

        elif interval == 'millisecond':
            for i in range(intervals(1 / 1000)):
                yield start + datetime.timedelta(milliseconds=i) * step

        elif interval == 'microsecond':
            for i in range(intervals(1e-6)):
                yield start + datetime.timedelta(microseconds=i) * step

        else:
            raise AttributeError("Interval must be 'week', 'day', 'hour', 'minute', "
                                 "'second', 'millisecond' or 'microsecond'.")

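
The branches of range_dt differ only in the timedelta unit, so the chain can be condensed with a lookup table. The following is a sketch of that idea rather than the answer author's code. The UNITS table is hypothetical, and the step count uses integer timedelta division instead of dividing total seconds as floats.

```python
import datetime

# Hypothetical lookup table standing in for the if/elif chain above.
UNITS = {
    'week': datetime.timedelta(weeks=1),
    'day': datetime.timedelta(days=1),
    'hour': datetime.timedelta(hours=1),
    'minute': datetime.timedelta(minutes=1),
    'second': datetime.timedelta(seconds=1),
    'millisecond': datetime.timedelta(milliseconds=1),
    'microsecond': datetime.timedelta(microseconds=1),
}

def range_dt(start, end, step=1, interval='day'):
    """Iterate over datetimes, similar to builtin range."""
    if interval not in UNITS:
        raise ValueError("unknown interval: %r" % (interval,))
    delta = UNITS[interval] * step
    # Number of whole steps that fit between start and end.
    count = (end - start) // delta
    for i in range(count):
        yield start + i * delta

hours = list(range_dt(datetime.datetime(2020, 1, 1, 0),
                      datetime.datetime(2020, 1, 1, 6),
                      step=2, interval='hour'))
for h in hours:
    print(h.isoformat())
```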
For ranges incremented by day, how about the following:

    for d in map(lambda x: startDate + datetime.timedelta(days=x),
                 range((stopDate - startDate).days)):
        pass  # Do stuff here

- startDate and stopDate are datetime.date objects
For a generic version:

    for d in map(lambda x: startTime + x * stepTime,
                 range(int((stopTime - startTime).total_seconds() / stepTime.total_seconds()))):
        pass  # Do stuff here

- startTime and stopTime are datetime.date or datetime.datetime objects (both should be the same type)
- stepTime is a timedelta object
Note that .total_seconds() is only supported from Python 2.7 onwards. If you are stuck with an earlier version, you can write your own function:

    def total_seconds(td):
        return float(td.microseconds + (td.seconds + td.days * 24 * 3600) * 10**6) / 10**6

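
The generic version above divides two total_seconds() values, which yields a float, while range() in Python 3 needs an integer. Truncating that ratio would also drop a final partial step: a 25-minute span with a 10-minute step would give two values rather than three. A minimal sketch of one way around both points uses ceiling division on the timedeltas themselves. The name time_range is hypothetical, and a positive step is assumed.

```python
from datetime import datetime, timedelta

def time_range(start_time, stop_time, step_time):
    """Yield start_time, start_time + step_time, ... while still before stop_time.

    Hypothetical helper; assumes step_time is a positive timedelta.
    """
    # Ceiling division on timedeltas: -(-a // b) rounds up, and
    # timedelta // timedelta returns an int in Python 3.
    count = -((start_time - stop_time) // step_time)
    for i in range(count):
        yield start_time + i * step_time

stamps = list(time_range(datetime(2020, 1, 1, 0, 0),
                         datetime(2020, 1, 1, 0, 25),
                         timedelta(minutes=10)))
for s in stamps:
    print(s.isoformat())
```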
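This function has some extra features:
- you can pass a string matching DATE_FORMAT for start or end, and it is converted to a date object
- you can pass a date object for start or end
- error checking in case the end is older than the start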

    import datetime
    from datetime import timedelta

    DATE_FORMAT = '%Y/%m/%d'

    def daterange(start, end):
        def convert(date):
            # Strings are parsed with DATE_FORMAT; date objects pass through unchanged.
            try:
                date = datetime.datetime.strptime(date, DATE_FORMAT)
                return date.date()
            except TypeError:
                return date

        def get_date(n):
            return (convert(start) + timedelta(days=n)).strftime(DATE_FORMAT)

        days = (convert(end) - convert(start)).days
        if days <= 0:
            raise ValueError('The start date must be before the end date.')
        for n in range(0, days):
            yield get_date(n)

    start = '2014/12/1'
    end = '2014/12/31'
    print(list(daterange(start, end)))

    # A date object also works as the start. A fixed date replaces date.today()
    # here so the example stays valid whenever it is run.
    start = datetime.date(2015, 11, 1)
    end = '2015/12/1'
    print(list(daterange(start, end)))