HEADER MERGING

Modified Aug 20

The problem we face is how to merge files that are mostly compatible but may have slightly different header keys. In the ASC pipeline we want to be very strict about controlling which header keywords appear in the result with particular values, while in generic applications we want to leave much more freedom.

To support this flexibility we introduce the idea of a 'rules file', which gives a set of rules for specific keywords. Each line of the file contains a key name followed by a set of rules; the special key name '*' sets a default rule to be followed by any keyword not present in the rest of the list.

The proposed routine

    extern void dmMergeHeader( dmBlock** input, long nblocks, dmBlock* output, char* rules_file )

would use this proposed rules file for header merging.

The supported rules are as follows:

Delete:
    The key should not appear in the merged header even if it is present in any or all of the input headers.

WarnFirst:
    Pick the value from the first header, regardless of later headers. Warning message if the key is not present or if the values differ.

WarnPrefer <value>:
    Like WarnFirst, but use the given value if it is present in any one of the headers. Failing that, use the value from the first header.

WarnOmit:
    Warn and omit if any of the values are different.

WarnOmit <tol>:
    Warn and omit if the values differ by more than the <tol> value, else take the first. For example, we may want to allow a tolerance (slop) of a few microns on the SIM position, but omit it (making it undefined) if the SIM position varies by more than that.

Force:
    Create a default (null, 0, blank) value if not present. Warning message if not the same in all input headers (so if it's always absent, no message is needed, and if it's always present with the same value, no message is needed).

Force <value>:
    Create with the given value if not present.

Calc:
    Calculate with special hard coded rules. If not present, omit. If there is no hard coded rule for this keyword, it's an error.
The hard coded rules are listed later in this document.

CalcForce:
    Like Calc, but if not present, force creation if possible.

Default <value>:
    If not present, assume the default value given. Used in combination with other rules. Thus,

        TIMEUNIT  Fail;Default s

    means that if TIMEUNIT is missing from any of the input headers, pretend it is present with the value 's', and then apply the 'Fail' rule. This implies that if TIMEUNIT is 's' in all the rest of the files, it can be missing from some of them, but if TIMEUNIT is 'd' in one of the files, it had better be present with value 'd' in all of them.

Fail:
    Fail if any values are different: give an error message and return an error value to the calling routine. Omit the keyword in the merged header.

Fail <tol>:
    Fail if any values differ by more than <tol>.

Min:
    Take the min of those present, without warning.

Max:
    Take the max of those present, without warning.

Match:
    Warning message if not the same in all headers; pick the first, but don't warn if not present.

Merge <value>:
    If the keyword is present and the values are different, use <value> instead of any of the individual values, and warn. Thus

        MISSION  Merge Merged; Force AXAF

    means:
    - if MISSION = 'ROSAT' in all headers, that's fine: MISSION = 'ROSAT' in the output
    - if MISSION = 'AXAF' in some and MISSION is missing in others, then MISSION = 'AXAF' in the output (because of Force AXAF)
    - if MISSION = 'ROSAT' in one and MISSION = 'Einstein' in another, then MISSION = 'Merged' in the output (because of Merge Merged)
    - if MISSION = 'ROSAT' in one and MISSION is missing in others, then MISSION = 'Merged' in the output (since Force AXAF implies that being missing is equivalent to being equal to AXAF)

    This allows us to warn the user if an unexpected combination of missions is being used.

In addition to these rules, we must handle the cases of mismatching data type, comment and unit. In all these cases, the proposed solution is to take the first data type, comment and unit and discard the rest.
Mismatch of data type will be considered a serious error and will generate an error message and set an error value in the routine. Mismatch of comment and/or unit will be considered a trivial error and will be silently ignored.

The proposed AXAF pipeline/data analysis merge rules file is:

    *         WarnOmit
    ORIGIN    WarnFirst; Force ASC
    CREATOR   WarnFirst; Force dmappend
    CHECKSUM  CalcForce
    DATASUM   CalcForce
    DATE      CalcForce
    CONTENT   Force
    HDUNAME   Force
    MISSION   Merge Merged; Force AXAF
    TELESCOP  Merge Merged; Force Unknown
    OBJECT    Merge Merged; Force Unknown
    RA_NOM    WarnOmit 0.0003
    DEC_NOM   WarnOmit 0.0003
    EQUINOX   WarnPrefer 2000.0
    RADECSYS  WarnPrefer ICRS
    DATACLAS  WarnPrefer Observed
    HDUSPEC   Warn
    HDUDOC    Warn
    HDUVERS   Warn
    HDUCLASS  Warn
    HDUCLAS1  Warn
    HDUCLAS2  Warn
    HDUCLAS3  Warn
    TSTART    CalcForce
    TSTOP     CalcForce
    DATE-OBS  CalcForce
    DATE-END  CalcForce
    TIMESYS   Fail;Default UTC
    MJDREF    Fail
    MJDREFI   Fail
    MJDREFF   Fail
    TIMEUNIT  Fail;Default s
    TIMVERSN  WarnFirst
    TIMEPIXR  Fail;Default 0.5
    TIMEDEL   Max
    TIMEZERO  Calc
    CLOCKAPP  WarnFirst
    TIERRELA  Max
    TIERABSO  Max
    OBSERVER  Merge Merged; Force Unknown
    TITLE     Merge Merged; Force Unknown
    OBS_ID    Merge Merged; Force Unknown
    INSTRUME  Merge Merged; Force Unknown
    DETNAM    Merge Merged; Force Unknown
    GRATING   Merge Merged; Force Unknown
    OBS_MODE  Merge Merged; Force Unknown
    DATAMODE  Merge Merged; Force Unknown
    SIM_X     WarnOmit 0.001
    SIM_Y     WarnOmit 0.001
    SIM_Z     WarnOmit 0.001
    FOC_LEN   WarnOmit 1.0
    ONTIME    CalcForce
    LIVETIME  CalcForce
    EXPOSURE  CalcForce
    DTCOR     CalcForce
    ROLL_NOM  WarnOmit 1.0

Alternative rules files are, first:

Non-SI definition:

    *         WarnFirst
    OBJECT    WarnOmit
    RA_NOM    WarnOmit 0.0003
    DEC_NOM   WarnOmit 0.0003
    EQUINOX   WarnPrefer 2000.0
    RADECSYS  WarnPrefer ICRS
    DATACLAS  WarnPrefer Observed
    MISSION   Merge Merged; Force AXAF
    TELESCOP  Merge Merged; Force Unknown
    INSTRUME  Merge Merged; Force Unknown

and next, a Generic (DM) definition, which assumes nothing about AXAF specific conventions. This should be the general default rules file.
It contains only the basic time and pointing direction related keywords.

    *         WarnFirst
    DATE      Calc
    TSTART    Calc
    TSTOP     Calc
    DATE-OBS  Calc
    DATE-END  Calc
    ONTIME    Calc
    LIVETIME  Calc
    EXPOSURE  Calc
    DTCOR     Calc
    TELESCOP  Merge Merged; Force Unknown
    OBJECT    Merge Merged; Force Unknown
    RA_NOM    WarnOmit 0.0003
    DEC_NOM   WarnOmit 0.0003
    EQUINOX   WarnPrefer 2000.0
    RADECSYS  WarnPrefer ICRS
    INSTRUME  Merge Merged; Force Unknown
    DETNAM    Merge Merged; Force Unknown

Adam and Joel's Level 2 rules:

    ORIGIN    WarnFirst, Force ASC
    CREATOR   WarnFirst, Force
    CHECKSUM  CalcForce
    DATASUM   CalcForce
    DATE      CalcForce
    DATE-OBS  CalcForce
    DATE-END  CalcForce
    TIMEUNIT  Fail;Default s
    TIMEPIXR  Fail;Default 0.5
    TIMEDEL   Max
    MISSION   Merge Merged; Force AXAF
    TELESCOP  Merge Merged; Force Unknown
    INSTRUME  Fail(*)
    DETNAM    Fail(*)
    GRATING   Fail(*)
    OBJECT    Merge Merged; Force Unknown
    TITLE     Merge Merged; Force Unknown
    OBSERVER  Merge Merged; Force Unknown
    OBS_ID    Merge Merged; Force Unknown
    SIM_X     WarnOmit 0.001
    SIM_Y     WarnOmit 0.001
    SIM_Z     WarnOmit 0.001
    OBS_MODE  Fail(*)
    RA_NOM    WarnOmit 0.0003
    DEC_NOM   WarnOmit 0.0003
    ROLL_NOM  WarnOmit 1.0
    EQUINOX   WarnPrefer 2000.0
    RADECSYS  WarnPrefer ICRS
    DATACLAS  WarnPrefer Observed
    ONTIME    CalcForce(*)
    LIVETIME  CalcForce(*)
    DTCOR     CalcForce(*)
    STARTMJF  Delete
    STARTMNF  Delete
    STARTOBT  Delete
    STOPMJF   Delete
    STOPMNF   Delete
    TLM_FMT   Delete
    BPIXFILn  Delete
    DEGAP_X1  Delete
    DEGAP_X2  Delete
    DEGAP_Y1  Delete
    DEGAP_Y2  Delete
    SHYPLUS   WarnOmit
    SHYMINUS  WarnOmit
    BPIXFILE  WarnOmit
    ADCCORF   WarnOmit
    GAINCORF  WarnOmit
    DEGAP     WarnOmit
    GAINFILE  WarnOmit

ACIS-specific keywords:

    GRD_FILE  Fail
    GRD_SCHM  Fail
    ONTIMEn   CalcForce
    LIVTIMEn  CalcForce
    (n=0,...,9, as necessary)

The issue of ACIS-specific keywords makes me think we may want to have an "if" rule:

    GRD_FILE  If (INSTRUME='ACIS') Fail; Else WarnOmit

or

    GRD_FILE  Fail If INSTRUME='ACIS'; WarnOmit

However, it seems like this could wait for a later rev. The ONTIMEn keywords may need to be handled specially.
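To make the rules-file format and the combined Merge/Force behaviour described above concrete, here is a minimal sketch in C. It is not part of the DM library: the names RuleLine, RuleClause, parse_rule_line and merge_force are hypothetical, the parser assumes that clauses on a line are separated by ';' (as in most of the examples above) and that each clause has at most one argument, and the warning messages a real implementation would issue are left out.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_CLAUSES 4
#define MAX_TOKEN   32

/* One clause of a rule, e.g. "Force AXAF" -> name "Force", arg "AXAF". */
typedef struct {
    char name[MAX_TOKEN];
    char arg[MAX_TOKEN];          /* empty string when there is no argument */
} RuleClause;

/* One parsed line of a rules file (hypothetical layout, not a DM type). */
typedef struct {
    char key[MAX_TOKEN];
    int nclauses;
    RuleClause clause[MAX_CLAUSES];
} RuleLine;

/* Parse "KEY Rule [arg]; Rule [arg]". Returns 0 on success, -1 if malformed. */
int parse_rule_line(const char *line, RuleLine *out)
{
    assert(line != NULL && out != NULL);
    memset(out, 0, sizeof *out);

    int used = 0;
    if (sscanf(line, " %31s%n", out->key, &used) != 1)
        return -1;                            /* blank line: no key name */
    const char *p = line + used;

    while (*p != '\0' && out->nclauses < MAX_CLAUSES) {
        size_t len = strcspn(p, ";");         /* text up to the next ';' */
        char buf[2 * MAX_TOKEN + 2];
        if (len >= sizeof buf)
            return -1;
        memcpy(buf, p, len);
        buf[len] = '\0';

        RuleClause *c = &out->clause[out->nclauses];
        if (sscanf(buf, " %31s %31s", c->name, c->arg) >= 1)
            out->nclauses++;                  /* arg stays "" if absent */

        p += len;
        if (*p == ';')
            p++;
    }
    return out->nclauses > 0 ? 0 : -1;
}

/*
 * Apply "Merge <merged>; Force <forced>" to one keyword.
 * vals[i] is the value in input header i, or NULL if the key is missing.
 * Force makes a missing key count as <forced>; Merge substitutes <merged>
 * as soon as two values disagree.
 */
const char *merge_force(const char *const *vals, int n,
                        const char *merged, const char *forced)
{
    const char *first = NULL;
    for (int i = 0; i < n; i++) {
        const char *v = vals[i] != NULL ? vals[i] : forced;
        if (first == NULL)
            first = v;
        else if (strcmp(first, v) != 0)
            return merged;
    }
    return first != NULL ? first : forced;
}
```

Parsing the line "MISSION Merge Merged; Force AXAF" and passing the two arguments to merge_force reproduces the four MISSION cases listed under the Merge rule: identical values pass through, a missing key is treated as 'AXAF', and any disagreement yields 'Merged'.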
----------------------------------------

Hard coded rules for Calc:

Rule 1: TSTART, TSTOP, TIMEZERO

    TIMEZERO is set to zero on output. Need to handle scaling of TIME column internally in DM (TIMEZERO is equivalent to a TZEROn on the TIME column).

    TSTART and TSTOP are set to the first TSTART and the last TSTOP.

    Rules for making from scratch:
    - check in obs.par in calling program
    - take ends of data subspace interval
    - for binned data, take first bin time, last bin time + binsize, adjusted for TIMEPIXR

    TIMEPIXR is
        0.0: times are start of bin
        0.5: times are middle of bin (default)
        1.0: times are end of bin

Rule 2: ONTIME, DTCOR, LIVETIME, EXPOSURE

    ONTIME is sum of TIME DSS
    DTCOR = sum(LIVETIMEi) / ONTIME
    LIVETIME = ONTIME * DTCOR
    EXPOSURE = new LIVETIME * sum(old EXPOSUREi)/sum(old LIVETIMEi)

Rule 3: MJDOBS, DATE-OBS, DATE-END

    MJDOBS is the MJD corresponding to TSTART + TIMEZERO
    DATE-OBS is the FITS format date string corresponding to TSTART + TIMEZERO
    DATE-END is the FITS format date string corresponding to TSTOP + TIMEZERO

Rule 4: CHECKSUM

    CHECKSUM should be automatically recalculated if present by the DM block close call, using the code provided by A. Rots or the call in CFITSIO. This will be done automatically by the data model at the time of block close, so nothing special need be done by hdrlib.

Notes on how to merge the headers:

I'm imagining we'll read all the input blocks, and then do for each key name { a set of dmKeyReads, a merge, a dmKeyWrite }, rather than caching the values of all the keys and then writing them all out later. I'm also imagining it'll be implemented on top of the current dm interface. However, if it were done at a lower level (using private dm functions) it might make more sense to cache an entire merged header and write it out at once.
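As a worked illustration of hard coded Rule 2 above, the following C sketch computes the four keywords from a list of time intervals and the per-input LIVETIME and EXPOSURE values. The types TimeInterval and ExposureKeys and the function calc_exposure_keys are hypothetical names, not DM calls, and the sketch assumes "sum of TIME DSS" means the summed lengths of the merged, non-overlapping TIME data subspace intervals.

```c
#include <assert.h>
#include <stddef.h>

/* One interval of the merged TIME data subspace (hypothetical type). */
typedef struct {
    double start;
    double stop;
} TimeInterval;

/* The four keywords produced by Rule 2. */
typedef struct {
    double ontime;
    double dtcor;
    double livetime;
    double exposure;
} ExposureKeys;

/*
 * dss[]          : merged TIME DSS intervals, assumed non-overlapping
 * old_livetime[] : LIVETIMEi from each input header
 * old_exposure[] : EXPOSUREi from each input header
 * Returns 0 on success, -1 if ONTIME or sum(LIVETIMEi) is not positive,
 * in which case the ratios in Rule 2 are undefined.
 */
int calc_exposure_keys(const TimeInterval *dss, size_t nintervals,
                       const double *old_livetime, const double *old_exposure,
                       size_t ninputs, ExposureKeys *out)
{
    assert(dss != NULL && old_livetime != NULL);
    assert(old_exposure != NULL && out != NULL);

    /* ONTIME is the sum of the TIME DSS interval lengths. */
    double ontime = 0.0;
    for (size_t i = 0; i < nintervals; i++)
        ontime += dss[i].stop - dss[i].start;

    double sum_live = 0.0;
    double sum_exp = 0.0;
    for (size_t i = 0; i < ninputs; i++) {
        sum_live += old_livetime[i];
        sum_exp += old_exposure[i];
    }

    if (ontime <= 0.0 || sum_live <= 0.0)
        return -1;

    out->ontime = ontime;
    out->dtcor = sum_live / ontime;                        /* sum(LIVETIMEi) / ONTIME */
    out->livetime = out->ontime * out->dtcor;              /* ONTIME * DTCOR */
    out->exposure = out->livetime * sum_exp / sum_live;    /* scale by old ratio */
    return 0;
}
```

For two 100 s intervals with input LIVETIME values of 90 and 80 and input EXPOSURE values of 45 and 40, this gives ONTIME = 200, DTCOR = 0.85, LIVETIME = 170 and EXPOSURE = 85, up to floating point rounding.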
This routine would need to:

    Read the rules file
    Do dmGetKeyList for each input block
    Iterate through the blocks
        Iterate through the keys in each block
            For each key not previously found,
            - find its counterparts in the other blocks
            - mark them as found
            - find the rule for this key
            - mark the rule as used
            - merge according to the rule
            - write the merged key
    Then, for each rule that hasn't been used yet and is of type 'Force', write the corresponding required key.

This will generate a merged header with:

    keys from the first input block in the same order in the output
    followed by keys from later blocks not present in the first block
    followed by 'forced' keys not present in any of the blocks.

We may want to add 'after NAME' rules to enforce positioning of the merged keys relative to one another, like for dmhedit. This can probably wait for a second rev.

- Jonathan
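The key ordering that results from the procedure above can be sketched in C as follows. This models only the ordering, using plain arrays of key names; KeyList and merged_key_order are hypothetical names, no DM calls are made, and rules that would remove a key from the output (Delete, WarnOmit, Fail) are deliberately ignored.

```c
#include <assert.h>
#include <string.h>

/* Key names of one input block, in header order (hypothetical type). */
typedef struct {
    const char *const *keys;
    int nkeys;
} KeyList;

/* Return 1 if key already appears in list[0..n-1], else 0. */
static int contains(const char *const *list, int n, const char *key)
{
    for (int i = 0; i < n; i++)
        if (strcmp(list[i], key) == 0)
            return 1;
    return 0;
}

/*
 * Fill out[] with the merged key order:
 *   1. keys of the first block, in their original order;
 *   2. keys of later blocks not seen before, in order of first appearance;
 *   3. keys named by unused 'Force' rules, i.e. present in no block.
 * Returns the number of keys written, or -1 if out[] is too small.
 */
int merged_key_order(const KeyList *blocks, int nblocks,
                     const char *const *forced, int nforced,
                     const char **out, int max_out)
{
    assert(blocks != NULL && out != NULL);
    int n = 0;

    for (int b = 0; b < nblocks; b++) {
        for (int k = 0; k < blocks[b].nkeys; k++) {
            const char *key = blocks[b].keys[k];
            if (contains(out, n, key))
                continue;                /* already merged and written */
            if (n == max_out)
                return -1;
            out[n++] = key;              /* merge + write would happen here */
        }
    }

    for (int f = 0; f < nforced; f++) {
        if (contains(out, n, forced[f]))
            continue;                    /* rule was used by some block */
        if (n == max_out)
            return -1;
        out[n++] = forced[f];
    }
    return n;
}
```

With a first block holding TSTART, TSTOP, OBJECT, a second block holding TSTART, DETNAM, OBJECT, and Force rules for OBJECT and MISSION, the output order is TSTART, TSTOP, OBJECT, DETNAM, MISSION. The linear search in contains is quadratic in the number of keys, which seems acceptable for typical header sizes; a hash table would be the obvious replacement if it is not.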