Date::Manip::Changes5to6 - describes differences between 5.xx and 6.00 |
Date::Manip::Changes5to6 - describes differences between 5.xx and 6.00
Date::Manip 6.00 represents a complete rethink and rewrite of Date::Manip. A great deal of effort was made to make sure that 6.00 is almost backwards compatible with 5.xx whenever feasible, but some functionality has changed in backwards incompatible ways. Other parts have been deprecated and will be removed at some point in the future.
This document describes the differences between the 5.xx series and version 6.00. This page primarily describes technical details, most of which do not impact how Date::Manip is used in scripts. If you want to make sure that a script which ran with 5.xx will run with 6.xx, refer to the Date::Manip::Migration5to6 document.
The Date::Manip 5.xx series of suffered from several weaknesses. These included:
The OO model allows a lot of information to be stored with each date (such as time zone information) which is discarded in the functional interface.
Date::Manip 6.00 is a complete rewrite of Date::Manip to address these and other issues.
The following sections address how Date::Manip 6.00 differs from previous releases, and describes changes that might need to be made to your script in order to upgrade from 5.xx to 6.00.
The most important changes are marked with asterisks.
A number of new modules have been created as well. These can be used directly, bypassing the main Date::Manip module. These include the following:
Date::Manip::Base contains many basic date operations which may be used to do simple date manipulation tasks without all the overhead of the full Date::Manip module.
Date::Manip::TZ contains time zone operations.
Handling dates, deltas, and recurrences are now done in Date::Manip::Date, Date::Manip::Delta, and Date::Manip::Recur.
All of these modules are object oriented, and are designed to be used directly, so if you prefer an OO interface over a functional interface, use these modules.
Some types of data depend on the config variables used, and these are cached separately, and this cache is automatically cleared every time a config variable is set. As a result, it is best if you set all config variables at the start, and then leave them alone completely to get optimal use of cached data.
A side effect of all this is that the Memoize module should not be used in conjunction with Date::Manip.
In the version 5.xx documentation, it was mentioned that the Memoize module might be used to improve performance in some cases. This is no longer the case. It should not be used with Date::Manip, even if you use the functional interface instead of the OO interface.
Ideally, this should never be necessary. If it is necessary, I'd like to hear about it so that I can add whatever standard methods are needed to the built in list.
However, I am interested in adding sets of time zones from various ``standards''.
Date::Manip 6.00 includes time zones from the following standards:
Olson zoneinfo database all Microsoft Windows time zones zones listed in RFC-822
If there are additional standards that include additional time zones not included here, please point me to them so they can be added. This could include published lists of time zone names supported on some operating system which have different names than the zoneinfo list.
As of 6.00, only time zones from standards will be included in the distribution (others can be added by users using the functions described in Date::Manip::TZ to add aliases for existing time zones).
The following time zones were in Date::Manip 5.xx but not in 6.00.
IDLW -1200 International Date Line West NT -1100 Nome SAT -0400 Chile CLDT -0300 Chile Daylight AT -0200 Azores MEWT +0100 Middle European Winter MEZ +0100 Middle European FWT +0100 French Winter GB +0100 GMT with daylight saving SWT +0100 Swedish Winter MESZ +0200 Middle European Summer FST +0200 French Summer METDST +0200 An alias for MEST used by HP-UX EETDST +0300 An alias for eest used by HP-UX EETEDT +0300 Eastern Europe, USSR Zone 1 BT +0300 Baghdad, USSR Zone 2 IT +0330 Iran ZP4 +0400 USSR Zone 3 ZP5 +0500 USSR Zone 4 IST +0530 Indian Standard ZP6 +0600 USSR Zone 5 AWST +0800 Australian Western Standard ROK +0900 Republic of Korea AEST +1000 Australian Eastern Standard ACDT +1030 Australian Central Daylight CADT +1030 Central Australian Daylight AEDT +1100 Australian Eastern Daylight EADT +1100 Eastern Australian Daylight NZT +1200 New Zealand IDLE +1200 International Date Line East
A series of modules are included which are auto-generated from the zoneinfo database. The Date::Manip::Zones, Date::Manip::TZ::*, and Date::Manip::Offset::* modules are all automatically generated and are not intended to be used directly. Instead, the Date::Manip::TZ module is used to access the data stored there.
A separate time zone module (Date::Manip::TZ::*) is included for every single time zone. There is also a module (Date::Manip::Offset::*) for every different offset. All told, there are almost 1000 modules. These are included to make time zone handling more efficient. Rather than calculating everything on the fly, information about each time zone and offset are included here which greatly speeds up the handling of time zones. These modules are only loaded as needed (i.e. only the modules related to the specific time zones you refer to are ever loaded), so there is no performance penalty to having them.
Also included in the distribution are a script (tzdata) and additional module (Date::Manip::TZdata). These are used to automatically generate the time zone modules, and are of no use to anyone other than the maintainer of Date::Manip. They are included solely for the sake of completeness. If someone wanted to fork Date::Manip, all the tools necessary to do so are included in the distribution.
Date::Manip 6.00 makes use of two different time zones: the actual local time zone the computer is running in (and which is used by the system clock), and a time zone that you want to work in. Typically, these are the same, but they do not have to be.
As of Date::Manip 6.00, the $::TZ and $ENV{TZ} variables are used only to specify the actual local time zone.
In order to specify an alternate time zone to work in, use the SetDate or ForceDate config variables.
Previously, variables passed in to Date_Init overrode values from config files. This has changed slightly. Options to Date_Init are now parsed in the order they are listed, so the following:
Date_Init("DateFormat=Other","ConfigFile=DateManip.cnf")
would first set the DateFormat variable, and then it would read the config file ``DateManip.cnf''. If that config file included a DateFormat definition, it would override the one passed in to Date_Init.
The proper way to override config files is to pass the config files in first, followed by any script-specific overrides. In other words:
Date_Init("ConfigFile=DateManip.cnf","DateFormat=Other")
GlobalCnf IgnoreGlobalCnf PersonalCnf PersonalCnfPath PathSep
All of these have been removed. Instead, the single config variable:
ConfigFile
will be used to specify config files (with no distinction between a global and personal config file). Also, no path searching is done. Each must be specified by a complete path. Finally, any number of config files can be used. So the following is valid:
Date_Init("ConfigFile=./Manip.cnf","ConfigFile=/tmp/Manip.cnf")
TodayIsMidnight Use DefaultTime instead.
ConvTZ Use SetDate or ForceDate instead.
Internal Use Printable instead.
DeltaSigns Use the Date::Manip::Delta::printf method to print deltas
UpdateCurrTZ With real time zone handling in place, this is no longer necessary
IntCharSet This has been replaced with better support for international character sets. The Encoding config variable may be used instead.
TZ Use SetDate or ForceDate instead.
Since it is now handles time change correctly (allowing time changes to occur in the alternate time zone), parsed results may be different than in 5.x (but since 5.x didn't have proper time zone handling, this is a good thing).
As of 6.00, these are treated strictly as date strings, so they are the current day, the day before, or the day after at the time 00:00:00.
The string ``now'' still refers to the current date and time.
Dates are now parsed according to the spec (though a couple extensions have been made, which are also documented in the Date::Manip::Date documentation).
There is one change with respect to Date::Manip 5.xx that results from a possible misinterpretation of the standard. In Date::Manip, there is a small amount of ambiguity in how the Www-D date formats are understood.
The date:
1996-w02-3
might be interpreted in two different ways. It could be interpreted as Wednesday (day 3) of the 2nd week of 1996, or as the 3rd day of the 2nd week of 1996 (which would be Tuesday if the week begins on Sunday). Since the specification only works with weeks which begin on day 1, the two are always equivalent in the specification, and the language of the specification doesn't clearly indicate one interpretation over the other.
Since Date::Manip supports the concept of weeks starting on days other than day 1 (Monday), the two interpretations are not equivalent.
In Date::Manip 5.xx, the date was interpreted as Wednesday of the 2nd week, but I now believe that the other interpretation (3rd day of the week) is the interpretation intended by the specification. In addition, if this interpretation is used, it is easy to get the other interpretation.
If 1996-w02-3 means the 3rd day of the 2nd week, then to get Wednesday (day 3) of the week, use the following two Date::Manip::Date methods:
$err = $date->parse("1996-w02-1"); $date2 = $date->next(3,1);
The first call gets the 1st day of the 2nd week, and the second call gets the next Wednesday.
If 1996-w02-3 is interpreted as Wednesday of the 2nd week, then to get the 3rd day of the week involves significantly more work.
In Date::Manip 6.00, the date will now be parsed as the 3rd day of the 2nd week.
This manifested itself it two ways. First, a lot of punctuation was ignored. For example, the string ``01 // 03 -. 75'' was the date 1975-01-03.
Second, a lot of word breaks were optional and it was often acceptable to run strings together. For example, the delta ``in5seconds'' would have worked.
With Date::Manip 6.00, parsing now tries to find a valid date in the string, but uses a more rigidly defined set of allowed formats which should more closely match how the dates would actually be expressed in real life. The punctuation allowed is more rigidly defined, and word breaks are required. So ``01/03/75'' will work, but ``01//03/75'' and ``01/03-75'' won't. Also, ``in5seconds'' will no longer work, though ``in 5 seconds'' will work.
These changes serve to simplify some of the regular expressions used in parsing dates, as well as simplifying the parsing routines. They also help to recognize actually dates as opposed to typos... it was too easy to pass in garbage and get a date out.
DD/YYmmm 03/09Jan DD/YYYYmmm 03/2009Jan mmmYYYY/DD Jan2009/03 YYYY/DDmmm 2009/03Jan
mmmYYYY Jan2009 YYYYmmm 2009Jan
The last two are no longer supported since they are incomplete.
With the exception of the incomplete forms, these could be added back in with very little effort. If there is ever a request to do so, I probably will.
DD/mmm/YYYY:HH:MN:SS
used in the apache logs. Due to the stricter parsing, this format is no longer supported directly. However, the parse_format method may be used to parse the date directly from an apache log line with no need to extract the date string beforehand.
The starting date is checked. If $timecheck was non-zero, the check failed if the date was not a business date, or if the time was not during business hours. If $timecheck was zero, the check failed if the date was not a business date, but the time was ignored.
In 5.xx, if the check failed, and $timecheck was non-zero, day 0 was defined as the start of the next business day, but if $timecheck was zero, day 0 was defined as the previous business day at the same time.
In 6.xx, if the check fails, and $timecheck is non-zero, the behavior is the same as before. If $timecheck is zero, day 0 is defined as the next business day at the same time.
So day 0 is now always the same, where before, day 0 meant two different things depending on whether $timecheck was zero or not.
So running a program on Jun 5, 2009 at noon that parsed the following dates gave the following return values:
Jun 12 => Jun 12, 2009 at 00:00:00 next week => Jun 12, 2009 at 12:00:00
This behavior is changed and now relies on the config variable DefaultTime. If DefaultTime is ``curr'', the default time for any date which includes no information about the time is the current time. Otherwise, the default time is midnight.
1:2:3
or in a language-specific expanded form:
1 hour 2 minutes 3 seconds
or in a mixed form:
1 hour 2:3
The mixed form has been dropped since I doubt that it sees much use in real life, and by dropping the mixed form, the parsing is much simpler.
Jan 10 1996 noon Jan 7 1998 noon
was +1:11:4:0:0:0:0 (or 1 year, 11 months, 4 weeks). As of Date::Manip 6.00, the delta is +2:0:-0:3:0:0:0 (or 2 years minus 3 days). Although this leads to mixed-sign deltas, it is actually how more people would think about the delta. It has the additional advantage of being MUCH easier and faster to calculate.
The old formats are described in the Date::Manip::DM5 manual, and the new ones are in the Date::Manip::Delta manual.
The new formats are much more flexible and I encourage you to switch over, however at this point, the old style formats are officially supported for the Delta_Format command.
At some point, the old style formats may be deprecated (and removed at some point beyond that), but for now, they are not.
The old formats are NOT available using the printf method.
In Date::Manip 5.xx, it could also refer to the nth day of the week (i.e. 1 being the 1st day of the week, -1 being the last day of the week). This meaning is no longer used in 6.xx.
For example, the recurrence:
1*2:3:4:0:0:0
referred to the 3rd occurrence of the 4th day of the week in February.
The meaning has been changed to refer to the 3rd occurrence of day 4 (Thursday) in February. This is a much more useful type of recurrence.
As a result of this change, the related recurrence:
1*2:3:-1:0:0:0
is invalid. Negative numbers may be used to refer to the nth day of the week, but NOT when referring to the day of week numbers.
This has been changed so that the dates may be on or before the end date.
every 2nd day in June [1997]
In actuality, this recurrence is not practical to calculate. It requires a base date which might imply June 1,3,5,... in 1997 but June 2,4,6 in 1998.
In addition, the recurrence does not fit the mold for other recurrences that are an approximate distance apart. This type of recurrence has a number of closely spaced events with 11-month gaps between groups.
I no longer consider this a valid recurrence and support is now dropped for this string.
I also dropped the following for a similar reason:
every 6th Tuesday [in 1999]
The Date::Manip module contains the same functions that Date::Manip 5.xx had (though the OO modules do all the work now). In general, the routines behave the same as before with the following exceptions:
Now, the Date_ConvTZ function only supports a 3 argument call:
$date = Date_ConvTZ($date,$from,$to);
If $from is not given, it defaults to the local time zone. If $to is not given, it defaults to the local time zone.
The optional 4th argument ($errlevel) is no longer supported. If there is an error, an empty string is returned.
Please refer to the the Date::Manip::Problems manpage documentation for information on submitting bug reports or questions to the author.
the Date::Manip manpage - main module documentation
This script is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
Sullivan Beck (sbeck@cpan.org)
Date::Manip::Changes5to6 - describes differences between 5.xx and 6.00 |