Monthy_1x1.txt 34Mb file. First column is date i.e. year+(month-1)/12 So, January of 1932 comes out as 1932.00, and December of 1932 is 1932.92. Remaining columns have latitude,longitude in the first row, and temperatures in Celsius thereafter. Columns are separate by a space.
Quality control:
- If a 1×1 degree sector has only one day’s worth of data, that is the temperature for that month: (TMAX+TMIN)/2. The intent is to avoid discarding data from sites that are subject to extreme weather events, or observer hardships. GISS quality control method is to discard a whole month for a site when a single day is missing, then “fill in” from surrounding stations, up to 1200km away. I believe the GISS method favors sites which are easy to observe.
- Daily minimum of maximum temperatures above 100C or below -100C were discarded. GHCN uses -9999 as a key for missing values, and occasionally a -999 would sneak in. In reality, the allowed range could be tighter to weed out any other irrational temperatures.
#!/bin/bash
# Monthly data
#wget -Nr ftp://ftp.ncdc.noaa.gov/pub/data/ghcn/v2 -o monthly.log
#find ftp.ncdc.noaa.gov/pub/data/ghcn -name \*.Z -exec uncompress '{}' \;
# Daily data
#wget -Nr --exclude-directories=grid ftp://ftp.ncdc.noaa.gov/pub/data/ghcn/daily -o daily.log
H=`pwd`
cd ftp.ncdc.noaa.gov/pub/data/ghcn/daily
# Start with list of stations' coordinates & codes
# ghcnd-stations.txt: List of stations and their metadata (e.g., coordinates)
# Contents look like:
# ID Lat Long Elev State Name GSNflag HCNflag WMOid
# AJ000037749 40.7000 47.7000 95.0 GEOKCAY 37749
# Look up min & max temps from ${StationID}.dly files
# ex: "AJ000037749.dly" line 1 looks like:
# YearMo Day1 Day2 Day3 ...
# AJ000037749193602PRCP 0 I 17 I 0 I 0 I 0 I 0 I 0 I 78 I 0 I 0 I 0 I 0 I 0 I 20 I 0 I 0 I 0 I 0 I 0 I 0 I 0 I 12 I 0 I 0 I 0 I 0 I 0 I 0 I 0 I-9999 -9999
# AJ000037749200910TMAX 240 S 224 S 252 S 256 S 250 S 250 S 239 S 212 S 220 S 221 S 225 S 239 S-9999 253 S 256 S 231 S 239 S 240 S 230 S 231 S 229 S 230 S 231 S-9999 243 S-9999 181 S 187 S 212 S 170 S 173 S
# List of stations with temperature data.
# This takes some time. Delete files to freshen the lists.
if [ ! -e ${H}/TempStations.txt ] ; then
awk '(F!=FILENAME)&&($1~/TMAX/){F=FILENAME; print F > "'${H}/TempStations.txt'"}' all/*.dly
fi
# Which stations are in each sector? Format: Lat Lon StationID
# where Lat is rounded to the nearest integer.
awk 'FILENAME~/ghcn/{\
Lat[$1]=int($2+90.5); Lon[$1]=int($3+180.5);\
}\
FILENAME~/TempSta/{\
ID=substr($1,5,11);\
Station[ID]=1;\
}\
END{\
for(ID in Station)print Lat[ID]-90,Lon[ID]-180,ID;\
}' ${H}/TempStations.txt ghcnd-stations.txt > ${H}/LatLonIDs.txt
#################################################################################
# Calculate daily average for each sector.
# Calculate monthly from dailies for each sector.
echo -n "" > ${H}/Daily_1x1.txt
for Coord in `awk '{print $1","$2}' ${H}/LatLonIDs.txt | sort -u`; do
# remove the comma between Lat & Lon
C="${Coord/,/ }"
echo $C
# list of stations with these coordinates
Tstations=(`grep ^"$C" ${H}/LatLonIDs.txt | awk '{print "all/"$3".dly"}'`)
awk -v Coord="$Coord" '{St=substr($0,0,11);Yr=substr($0,12,4);Mo=substr($0,16,2);}\
$1~/TMAX/{l=length($0); d=1; \
for(i=22;i<l;i+=8){\
mx=substr($0,i,5)/10.0;\
if((mx<100)&&(mx>(-100))){\
tmx[Yr,Mo,d]+=mx; Nx[Yr,Mo,d]++;\
if(D[Yr,Mo,d]==""){\
D[Yr]++; Day[Yr,Mo,d]=1;\
}\
}\
d++}}\
$1~/TMIN/{l=length($0); d=1;\
for(i=22;i<l;i+=8){\
mn=substr($0,i,5)/10.0;\
if((mn<100)&&(mn>(-100))){tmn[Yr,Mo,d]+=mn; Nn[Yr,Mo,d]++;}\
d++}}\
END{split(Coord,C,",");\
for(day in D){\
if(tmn[day]!=""){d=day; gsub(SUBSEP,"/",d); \
print C[1],C[2],d,((tmx[day]/Nx[day])+(tmn[day]/Nn[day]))/2.0}\
}}' ${Tstations[@]} | sort -n >> ${H}/Daily_1x1.txt
done
echo "Finished daily, starting monthly"
#################################################################################
# Monthly by sector
awk '{split($3,day,"/");\
if(day[1]<1900){next}\
date=day[1]+(day[2]-1)/12.0;\
T[$1,$2,date]+=$4; Tn[$1,$2,date]++;\
days[date]=1; LatLon[$1,$2]=1;\
}\
END{\
printf("Year ");\
for(LL in LatLon){split(LL,info,SUBSEP); printf("%d,%d ",info[1],info[2]);}\
printf("\n");\
for(d in days){\
printf("%1.3f ",d);\
for(LL in LatLon){\
if(Tn[LL,d]!="") {\
printf("%1.4f ",T[LL,d]/Tn[LL,d]);\
}else{\
printf("nan ");\
}\
}\
printf("\n");\
}\
}' ${H}/Daily_1x1.txt | sort -n +0 -1 > ${H}/Monthly_1x1.txt
exit
Monthly_1X1.txt does not seem to be at the link you provide.
Kind regards, Casey
By: Casey on March 17, 2010
at 5:11 pm