I've been thinking about this for some time, but yesterday I decided to run some tests on log processing to see and learn what the pitfalls of such a system could be. Not that I wanted to build such a system; I had burned my brains out on a rather boring algorithm and needed to cool off :)
Let's see how long it takes to create 500K (500,000) QSOs in a file, then to find the unique callsigns, and finally to search the same file for the positions of a specific callsign's QSOs.
Not bad... 52 seconds on my slow machine...
QSOs and callsigns are randomly created in this format:
CALLSIGN1:CALLSIGN2:YEARmonthDAYhourMinute:RST:qth,op,
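As a small illustration of that record layout, here is a minimal sketch of splitting one line back into its fields. The function name `parse_qso_line` and the array keys are made up for this example; only the colon-separated field order comes from the format above.

```php
<?php
// Hypothetical helper (not part of the original scripts): split one record
// of the form CALL1:CALL2:UTC:RST:comment into named fields.
function parse_qso_line(string $line): ?array
{
    // limit of 5 so that any ':' inside the trailing comment field is kept
    $pieces = explode(':', rtrim($line, "\r\n"), 5);
    if (count($pieces) < 5) {
        return null; // blank or malformed line
    }
    return [
        'call1'   => $pieces[0],
        'call2'   => $pieces[1],
        'utc'     => $pieces[2],
        'rst'     => $pieces[3],
        'comment' => $pieces[4],
    ];
}

$qso = parse_qso_line("CT2GQV:ET3QPV:201201151230:599:qth,op,\n");
echo $qso['call1'], ' worked ', $qso['call2'], ' with RST ', $qso['rst'], "\n";
```

Returning null for short lines is just one way to deal with a blank last line in the file; skipping such lines in the caller would work as well.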
Now let's find the unique callsigns in the QSO list. Since they were randomly generated and are fairly long (2 letters, 1 number and 3 letters), almost none (in percentage terms) were duplicates: of 1 million (500K * 2) callsigns, 993864 were unique.
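That count is about what probability predicts. As a rough check, assuming every callsign is drawn uniformly and independently from the 24 x 24 x 10 x 24 x 24 x 24 possibilities (an idealisation of what rand() really does), the expected number of distinct values among n draws from N possibilities is N * (1 - (1 - 1/N)^n). The function name below is invented for this sketch.

```php
<?php
// Expected number of distinct values when drawing $n times, uniformly and
// independently, from $N equally likely possibilities:
//   N * (1 - (1 - 1/N)^n)
// log1p/expm1 are used because 1/N is tiny and plain pow() would lose digits.
function expected_unique(float $N, float $n): float
{
    return -$N * expm1($n * log1p(-1.0 / $N));
}

$possible = pow(24, 5) * 10;   // 2 letters + 1 digit + 3 letters = 79,626,240
$draws    = 500000 * 2;        // two callsigns per QSO

printf("possible callsigns: %d\n", $possible);
printf("expected unique:    %.0f\n", expected_unique($possible, $draws));
```

Under those assumptions the estimate comes out at roughly 993,750, close to the 993864 measured, so the few thousand duplicates are simply what chance gives.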
OK, now we start to see that searching is fast and file creation is slow, and that alone starts to explain the slow log processing (or not, since I have no clue what type of system is used), especially if the logs come in over the network, although multiple concurrent connections and processes can speed that up... I'm sure DOS is not used :)
Also fast is finding the positions of one callsign's QSOs in the file:
...especially after buffering (the file read, in this case done by the OS)... see the difference between the first run of the program and the subsequent ones... I am sure that looking up the QSO positions of all callsigns in the file after buffering would take less than 20 hours...
I didn't try different algorithms to optimize the system, nor did I use a database, something I hope LoTW uses. Also, the language chosen is not the fastest for this type of operation.
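One alternative that was not tried in these tests, given only as a sketch: instead of scanning the whole file once per callsign, read it a single time and remember, for every callsign, the byte offsets of the lines it appears in. The function name `build_call_index` is hypothetical, the record layout is assumed to be the one described above, and for 500K QSOs the array will need a fair amount of memory, so PHP's memory_limit may have to be raised.

```php
<?php
// Hypothetical helper (not part of the original scripts): one pass over the
// QSO file, mapping each callsign to the byte offsets of its QSO lines.
function build_call_index(string $path): array
{
    $fh = fopen($path, 'r');
    if ($fh === false) {
        throw new RuntimeException("can't open $path");
    }

    $index = [];
    while (true) {
        $offset = ftell($fh);          // position where the next line starts
        $line = fgets($fh);
        if ($line === false) {
            break;                     // end of file
        }
        $pieces = explode(':', $line);
        if (count($pieces) < 2) {
            continue;                  // blank or malformed line
        }
        $index[$pieces[0]][] = $offset;
        if ($pieces[1] !== $pieces[0]) {
            $index[$pieces[1]][] = $offset;
        }
    }
    fclose($fh);

    return $index;
}
```

With the index built, something like `$index['ET3QPV'] ?? []` gives the line offsets for one callsign without touching the file again. Note that these are offsets of the start of each line, not of the callsign inside the line as in the strpos() script.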
Here's the code used in case you need it for something...
------------//---------
Create 500k random QSOs
------//-----------
<?php
// Create random QSO contacts between random callsigns and write them to a file...
// just for testing matching QSOs
// by CT2GQV 2012
// Licence: use and abuse, it's free
// If you don't change the settings it will create 500K records...

// settings
set_time_limit(120); // 2 minutes instead of 30s... only with safe mode disabled, or change php.ini
$contacts_file = './contacts.qsl';
$create_how_many = 500000; // may not be possible on all systems...
$start_time = microtime(true);

// one stupid way of generating random chars...
$characters = array("A","B","C","D","E","F","G","H","J","K","L","M","N","P","Q","R","S","T","U","V","W","X","Y","Z");

// let's create
$rst_count = 0;
$a = 0;

// open the file once, before the loop...
$fh = fopen($contacts_file, 'a') or die("ERROR: can't open contacts");

while ($a < $create_how_many) {
    // 2 letters, 1 number, 3 letters... for simplicity
    $call1 = $characters[rand(0,23)].$characters[rand(0,23)].rand(0,9).$characters[rand(0,23)].$characters[rand(0,23)].$characters[rand(0,23)];
    $call2 = $characters[rand(0,23)].$characters[rand(0,23)].rand(0,9).$characters[rand(0,23)].$characters[rand(0,23)].$characters[rand(0,23)];

    // minimum signal is 233 :)
    $rst = rand(2,5).rand(3,9).rand(3,9);

    // just the creation date, in UTC; "H" (not "G") keeps the hour at two digits
    $utc = gmdate("YmdHi");

    // just for fun...
    if ($rst == "599") { $rst_count++; }

    // uncomment the next line if an echo is needed
    // echo "$call1:$call2:$utc:$rst:Just a comment\n";

    // create the file contacts.qsl beforehand and chmod it to writable...
    $data_to_apend = "$call1:$call2:$utc:$rst:qth,op,\n";
    fwrite($fh, $data_to_apend);

    // add the counter...
    $a++;
}

// closed only after the loop to save some time...
fclose($fh);

$end_time = microtime(true);
$time = $end_time - $start_time;
echo "\n\nDone $create_how_many contacts in $time seconds and $rst_count QSOs were 599...\n\n";
?>
--------//-----------
Find unique callsigns
-------//------------
<?php
// some settings
set_time_limit(120);

// the file with the QSOs
$qsl_file = './500kcontacts.qsl';
$uniq_call_file = './uniq-call.qsl';
$uniq_call_array = array();
$temp = array();
$start_time = microtime(true);

// let's loop over the QSO file
$file_handle = fopen($qsl_file, "r") or die("ERROR: can't open the QSO's file");
// where we are going to store the unique callsigns
// (note: 'a' appends, so delete the file before a re-run)
$file_handle2 = fopen($uniq_call_file, 'a') or die("ERROR: can't open uniq callsign file");

$count = 0;
while (($lines = fgets($file_handle)) !== false) {
    $pieces = explode(":", $lines);
    // skip blank or malformed lines, where $pieces[1] would not exist
    if (count($pieces) >= 2 && ($pieces[0] != "" || $pieces[1] != "")) {
        // if one or the other is not an empty callsign then save...
        // the only issue is an empty callsign, but rand on the creation doesn't allow that :)
        $temp[] = $pieces[0];
        $temp[] = $pieces[1];
    }
} // end loop reading the QSO file
fclose($file_handle); // it's good to free before another mem request...

$uniq_call_array = array_unique($temp);
foreach ($uniq_call_array as $value) {
    // echo "$value\n";
    $add_to_file = "$value\n";
    fwrite($file_handle2, $add_to_file);
    $count++;
}
fclose($file_handle2);

$end_time = microtime(true);
$time = $end_time - $start_time;
echo "\n$count unique callsigns written to $uniq_call_file\n";
echo "In $time seconds\n";
?>
-------//---------
Find a contact from a call in the file
-------//---------
<?php
$start_time = microtime(true);
$search_call = "ET3QPV";
$file = file_get_contents("./500kcontacts.qsl");
$offset = 0;
$counter = 0;

// strpos() returns false when nothing is found, and false == 0 is true in PHP,
// so a match at position 0 has to be tested with ===
if (strpos($file, $search_call) === 0) {
    $counter++;
    echo "\nQSO #$counter at pos: 0";
}
while (($offset = strpos($file, $search_call, $offset + 1)) !== false) {
    $counter++;
    echo "\nQSO #$counter at pos: $offset";
}

$end_time = microtime(true);
$time = $end_time - $start_time;
echo "\nFound $counter QSOs in $time seconds";

// extrapolate the time of this one search to every unique callsign
$time = $time * 993864;
$hour = $time / 3600;
echo "\nFor 993864 calls that should be more or less: $time seconds... or $hour hours";
?>
Simple, huh?
For now I will continue to use paper and a pen for log processing...