Retrieving a date from a complex string in Python
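I'm trying to use datetime.strptime to build a single datetime from two strings.
The time is easy (e.g. 8:53PM), so I can do something like this:

    theTime = datetime.strptime(givenTime, "%I:%M%p")

However, the other string is more than just a date: it is a URL, which I could parse with something like:

    theDate = datetime.strptime(givenURL, "http://site.com/?year=%Y&month=%m&day=%d&hour=%H")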
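But I don't want to take the hour from the link, because it is being retrieved elsewhere. Is there a way to put in a dummy symbol (like %x or something) to act as a flexible placeholder for that last variable?
In the end, I envision having a single line similar to:

    theDateTime = datetime.strptime(givenURL + givenTime, "http://site.com/?year=%Y&month=%m&day=%d&hour=%x%I:%M%p")

(although, obviously, %x wouldn't be the symbol used). Any ideas?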
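If you simply want to skip the hour in the URL, you can use split, for example like this:

    from datetime import datetime

    givenURL = 'http://site.com/?year=2011&month=10&day=5&hour=11'
    pattern = "http://site.com/?year=%Y&month=%m&day=%d"
    # Drop everything from '&hour=' onwards before parsing.
    theDate = datetime.strptime(givenURL.split('&hour=')[0], pattern)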
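The split approach above relies on `&hour=` coming after the date fields. As an alternative sketch (a different technique from the answer, not the answerer's method), the query string can be parsed with the standard library's `urllib.parse`, which does not depend on parameter order. The function name `date_from_url` and the variable names are made up for this illustration, and the code is written for Python 3.

```python
from datetime import date
from urllib.parse import urlparse, parse_qs


def date_from_url(url):
    """Build a date from the year/month/day query parameters, ignoring the rest."""
    # parse_qs maps each parameter name to a list of string values.
    params = parse_qs(urlparse(url).query)
    year, month, day = (int(params[name][0]) for name in ("year", "month", "day"))
    return date(year, month, day)


given_url = "http://site.com/?year=2011&month=10&day=5&hour=11"
the_date = date_from_url(given_url)
print(the_date)  # 2011-10-05
```

The hour parameter is simply never read, so it can be present, absent, or in any position.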
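Not sure I understood you correctly, but:

    from datetime import datetime

    givenURL = 'http://site.com/?year=2011&month=10&day=5&hour=11'
    givenTime = '08:53PM'
    datePattern = "http://site.com/?year=%Y&month=%m&day=%d"
    timePattern = "&time=%I:%M%p"
    # Replace the '&hour=...' tail with '&time=<givenTime>' and parse both parts at once.
    theDateTime = datetime.strptime(givenURL.split('&hour=')[0] + '&time=' + givenTime, datePattern + timePattern)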
    # Python 2 code (print statement)
    import datetime
    import re

    givenURL = 'http://site.com/?year=2011&month=10&day=5&hour=11'
    givenTime = '08:53PM'

    print '\ngivenURL == ' + givenURL
    print 'givenTime == ' + givenTime

    # Capture year, month and day; the hour is matched but not captured.
    regx = re.compile(r'year=(\d\d\d\d)&month=(\d\d?)&day=(\d\d?)&hour=\d\d?')
    print '\nmap(int,regx.search(givenURL).groups()) ==',map(int,regx.search(givenURL).groups())

    theDate = datetime.date(*map(int,regx.search(givenURL).groups()))
    theTime = datetime.datetime.strptime(givenTime,"%I:%M%p")

    print '\ntheDate ==',theDate,type(theDate)
    print 'theTime ==',theTime,type(theTime)

    # Overwrite the default 1900-01-01 date of theTime with the date from the URL.
    theDateTime = theTime.replace(theDate.year,theDate.month,theDate.day)

    print '\ntheDateTime ==',theDateTime,type(theDateTime)
Result:

    givenURL == http://site.com/?year=2011&month=10&day=5&hour=11
    givenTime == 08:53PM

    map(int,regx.search(givenURL).groups()) == [2011, 10, 5]

    theDate == 2011-10-05 <type 'datetime.date'>
    theTime == 1900-01-01 20:53:00 <type 'datetime.datetime'>

    theDateTime == 2011-10-05 20:53:00 <type 'datetime.datetime'>
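For readers on Python 3, here is a rough port of the same idea: pull the date out of the URL with a regex, parse the time separately, and merge the two. It is only a sketch. It uses `datetime.combine` instead of `replace` to do the merge, the snake_case names are my own, and it assumes a locale in which `%p` recognises "AM"/"PM".

```python
import re
from datetime import date, datetime

given_url = "http://site.com/?year=2011&month=10&day=5&hour=11"
given_time = "08:53PM"

# Capture year, month and day; the hour parameter is left out of the pattern.
date_regex = re.compile(r"year=(\d{4})&month=(\d\d?)&day=(\d\d?)")

match = date_regex.search(given_url)
if match is None:
    raise ValueError("no date found in URL: " + given_url)

the_date = date(*map(int, match.groups()))
# strptime fills in 1900-01-01 for the missing date fields; .time() drops them.
the_time = datetime.strptime(given_time, "%I:%M%p").time()
the_datetime = datetime.combine(the_date, the_time)

print(the_datetime)  # 2011-10-05 20:53:00
```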
Edit 1
Since strptime() is slow, I improved my code to eliminate it:
    # Python 2 code (print statement, xrange, time.clock)
    from datetime import datetime
    import re
    from time import clock

    n = 10000

    givenURL = 'http://site.com/?year=2011&month=10&day=5&hour=11'
    givenTime = '08:53AM'

    # eyquem
    regx = re.compile(r'year=(\d\d\d\d)&month=(\d\d?)&day=(\d\d?)&hour=\d\d? (\d\d?):(\d\d?)(PM|pm)?')
    t0 = clock()
    for i in xrange(n):
        given = givenURL + ' ' + givenTime
        mat = regx.search(given)
        grps = map(int,mat.group(1,2,3,4,5))
        if mat.group(6):
            # when it is PM/pm, the hour must be augmented with 12
            # (note: 12:xx AM/PM is not handled specially here)
            grps[3] += 12
        theDateTime1 = datetime(*grps)
    print clock()-t0,"seconds eyquem's code"
    print theDateTime1

    print

    # Artsiom Rudzenka
    dateandtimePattern = "http://site.com/?year=%Y&month=%m&day=%d&time=%I:%M%p"
    t0 = clock()
    for i in xrange(n):
        theDateTime2 = datetime.strptime(givenURL.split('&hour=')[0] + '&time=' + givenTime, dateandtimePattern)
    print clock()-t0,"seconds Artsiom's code"
    print theDateTime2

    print
    print theDateTime1 == theDateTime2
Result:

    0.460598763251 seconds eyquem's code
    2011-10-05 08:53:00

    2.10386180366 seconds Artsiom's code
    2011-10-05 08:53:00

    True
My code is about 4.5 times faster. That may be worth having if there are a lot of these conversions to perform.
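To repeat this comparison on Python 3, something along these lines should work. It is a sketch, not the benchmark that produced the numbers above: it uses `timeit` rather than `time.clock`, the function names `with_regex` and `with_strptime` are invented for the example, and the regex variant additionally converts 12-hour times properly (12AM and 12PM included). No timings are quoted, since they depend on the machine and Python version.

```python
import re
import timeit
from datetime import datetime

given_url = "http://site.com/?year=2011&month=10&day=5&hour=11"
given_time = "08:53AM"

regex = re.compile(
    r"year=(\d{4})&month=(\d\d?)&day=(\d\d?)&hour=\d\d? (\d\d?):(\d\d)(AM|PM)",
    re.IGNORECASE,
)
pattern = "http://site.com/?year=%Y&month=%m&day=%d&time=%I:%M%p"


def with_regex():
    match = regex.search(given_url + " " + given_time)
    year, month, day, hour, minute = map(int, match.group(1, 2, 3, 4, 5))
    # Convert the 12-hour clock to 24-hour: 12AM -> 0, 12PM -> 12, 1PM -> 13.
    hour = hour % 12 + (12 if match.group(6).upper() == "PM" else 0)
    return datetime(year, month, day, hour, minute)


def with_strptime():
    text = given_url.split("&hour=")[0] + "&time=" + given_time
    return datetime.strptime(text, pattern)


# Both approaches should agree before their speed is compared.
assert with_regex() == with_strptime()

n = 10000
print("regex:   ", timeit.timeit(with_regex, number=n), "seconds")
print("strptime:", timeit.timeit(with_strptime, number=n), "seconds")
```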
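It isn't possible to do that with a format string. However, if the hour doesn't matter, you can parse it from the URL along with the date, as in your first example, and then make a single follow-up call on the result.
That way there is no need for any additional parsing.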
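One way to read this suggestion, assuming the follow-up call is `datetime.replace` (that is my assumption; the answer as given here does not name the call), is sketched below for Python 3. The URL is parsed whole, hour included, and the hour and minute are then overwritten with the values from the separately supplied time string.

```python
from datetime import datetime

given_url = "http://site.com/?year=2011&month=10&day=5&hour=11"
given_time = "08:53PM"

# Parse the whole URL, hour included, as in the question's first attempt.
url_datetime = datetime.strptime(
    given_url, "http://site.com/?year=%Y&month=%m&day=%d&hour=%H"
)
# Parse the separately supplied time; its date part defaults to 1900-01-01.
the_time = datetime.strptime(given_time, "%I:%M%p")

# Overwrite the hour and minute taken from the URL with the ones from given_time.
the_datetime = url_datetime.replace(hour=the_time.hour, minute=the_time.minute)
print(the_datetime)  # 2011-10-05 20:53:00
```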