Posted by: genezeien | January 15, 2010

Monthly gridded data in easy to use format

Monthy_1x1.txt 34Mb file. First column is date i.e. year+(month-1)/12  So, January of 1932 comes out as 1932.00, and December of 1932 is 1932.92.  Remaining columns have latitude,longitude in the first row, and temperatures in Celsius thereafter. Columns are separate by a space.

Quality control:

  1. If a 1×1 degree sector has only one day’s worth of data, that is the temperature for that month: (TMAX+TMIN)/2.  The intent is to avoid discarding data from sites that are subject to extreme weather events, or observer hardships.  GISS quality control method is to discard a whole month for a site when a single day is missing, then “fill in” from surrounding stations, up to 1200km away.  I believe the GISS method favors sites which are easy to observe.
  2. Daily minimum of maximum temperatures above 100C or below -100C were discarded.  GHCN uses -9999 as a key for missing values, and occasionally a -999 would sneak in.  In reality, the allowed range could be tighter to weed out any other irrational temperatures.
#!/bin/bash

# Monthly data
#wget -Nr ftp://ftp.ncdc.noaa.gov/pub/data/ghcn/v2 -o monthly.log
#find ftp.ncdc.noaa.gov/pub/data/ghcn -name \*.Z -exec uncompress '{}' \;

# Daily data
#wget -Nr --exclude-directories=grid ftp://ftp.ncdc.noaa.gov/pub/data/ghcn/daily -o daily.log
H=`pwd`
cd ftp.ncdc.noaa.gov/pub/data/ghcn/daily

# Start with list of stations' coordinates & codes
# ghcnd-stations.txt:   List of stations and their metadata (e.g., coordinates)
# Contents look like:
# ID           Lat       Long      Elev   State     Name    GSNflag  HCNflag     WMOid
# AJ000037749  40.7000   47.7000   95.0   GEOKCAY                                37749

# Look up min & max temps from ${StationID}.dly files
# ex: "AJ000037749.dly"  line 1 looks like:
#            YearMo      Day1    Day2    Day3    ...
# AJ000037749193602PRCP    0  I   17  I    0  I    0  I    0  I    0  I    0  I   78  I    0  I    0  I    0  I    0  I    0  I   20  I    0  I    0  I    0  I    0  I    0  I    0  I    0  I   12  I    0  I    0  I    0  I    0  I    0  I    0  I    0  I-9999   -9999
# AJ000037749200910TMAX  240  S  224  S  252  S  256  S  250  S  250  S  239  S  212  S  220  S  221  S  225  S  239  S-9999     253  S  256  S  231  S  239  S  240  S  230  S  231  S  229  S  230  S  231  S-9999     243  S-9999     181  S  187  S  212  S  170  S  173  S

# List of stations with temperature data.
# This takes some time.  Delete files to freshen the lists.
if [ ! -e ${H}/TempStations.txt ] ; then
  awk '(F!=FILENAME)&&($1~/TMAX/){F=FILENAME; print F > "'${H}/TempStations.txt'"}' all/*.dly
fi

# Which stations are in each sector?  Format:  Lat Lon StationID
# where Lat is rounded to the nearest integer.
awk 'FILENAME~/ghcn/{\
      Lat[$1]=int($2+90.5); Lon[$1]=int($3+180.5);\
     }\
     FILENAME~/TempSta/{\
      ID=substr($1,5,11);\
      Station[ID]=1;\
     }\
     END{\
      for(ID in Station)print Lat[ID]-90,Lon[ID]-180,ID;\
     }' ${H}/TempStations.txt ghcnd-stations.txt > ${H}/LatLonIDs.txt

#################################################################################
# Calculate daily average for each sector.
# Calculate monthly from dailies for each sector.
echo -n "" > ${H}/Daily_1x1.txt
for Coord in `awk '{print $1","$2}' ${H}/LatLonIDs.txt | sort -u`; do
   # remove the comma between Lat & Lon
   C="${Coord/,/ }"
   echo $C
   # list of stations with these coordinates
   Tstations=(`grep ^"$C" ${H}/LatLonIDs.txt | awk '{print "all/"$3".dly"}'`)
         awk -v Coord="$Coord" '{St=substr($0,0,11);Yr=substr($0,12,4);Mo=substr($0,16,2);}\
           $1~/TMAX/{l=length($0); d=1; \
               for(i=22;i<l;i+=8){\
                  mx=substr($0,i,5)/10.0;\
                  if((mx<100)&&(mx>(-100))){\
                     tmx[Yr,Mo,d]+=mx; Nx[Yr,Mo,d]++;\
                     if(D[Yr,Mo,d]==""){\
                        D[Yr]++; Day[Yr,Mo,d]=1;\
                     }\
                  }\
                  d++}}\
           $1~/TMIN/{l=length($0); d=1;\
               for(i=22;i<l;i+=8){\
                  mn=substr($0,i,5)/10.0;\
                  if((mn<100)&&(mn>(-100))){tmn[Yr,Mo,d]+=mn; Nn[Yr,Mo,d]++;}\
                  d++}}\
           END{split(Coord,C,",");\
               for(day in D){\
                 if(tmn[day]!=""){d=day; gsub(SUBSEP,"/",d); \
                    print C[1],C[2],d,((tmx[day]/Nx[day])+(tmn[day]/Nn[day]))/2.0}\
               }}' ${Tstations[@]} | sort -n >> ${H}/Daily_1x1.txt

done
echo "Finished daily, starting monthly"
#################################################################################
# Monthly by sector
awk '{split($3,day,"/");\
      if(day[1]<1900){next}\
      date=day[1]+(day[2]-1)/12.0;\
      T[$1,$2,date]+=$4; Tn[$1,$2,date]++;\
      days[date]=1; LatLon[$1,$2]=1;\
     }\
     END{\
      printf("Year ");\
      for(LL in LatLon){split(LL,info,SUBSEP); printf("%d,%d ",info[1],info[2]);}\
      printf("\n");\
      for(d in days){\
         printf("%1.3f ",d);\
         for(LL in LatLon){\
            if(Tn[LL,d]!="") {\
               printf("%1.4f ",T[LL,d]/Tn[LL,d]);\
            }else{\
               printf("nan ");\
            }\
         }\
         printf("\n");\
      }\
     }' ${H}/Daily_1x1.txt | sort -n +0 -1 > ${H}/Monthly_1x1.txt

exit

About these ads

Responses

  1. Monthly_1X1.txt does not seem to be at the link you provide.
    Kind regards, Casey


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Categories

Follow

Get every new post delivered to your Inbox.

%d bloggers like this: