I've been thinking about this for some time, but yesterday I decided to run some tests on log processing to see and learn what the pitfalls of such a system could be. Not that I wanted to build such a system, but I had burned my brain out on a rather boring algorithm and needed to cool my ideas :)
Let's see how long it takes to create 500K (500000) QSO's in a file, then to look for the uniq callsigns, and finally to search for the position of a specific callsign's QSO's in the same file.
Not bad... 52 seconds on my slow machine...
QSO's and call signs are randomly created in this format:
CALLSIGN1:CALLSIGN2:YEARmonthDAYhourMinute:RST:qth,op,
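To make the format concrete, here is a minimal sketch of how one line could be split back into its fields. The helper name parse_qso_line() and the sample line are my own inventions for illustration; the scripts further down just call explode() directly.

```php
<?php
// Hypothetical helper (not in the original scripts): split one log line
// of the form CALLSIGN1:CALLSIGN2:YEARmonthDAYhourMinute:RST:comment
// into named fields. Returns null for blank or malformed lines.
function parse_qso_line(string $line): ?array
{
    // limit to 5 pieces so a ':' inside the free-text part is preserved
    $pieces = explode(':', rtrim($line, "\r\n"), 5);
    if (count($pieces) < 5 || $pieces[0] === '' || $pieces[1] === '') {
        return null;
    }
    return [
        'call1'   => $pieces[0],
        'call2'   => $pieces[1],
        'utc'     => $pieces[2],
        'rst'     => $pieces[3],
        'comment' => $pieces[4],
    ];
}

// made-up sample line in the same format
$qso = parse_qso_line("ET3QPV:AB1CDE:201201011200:599:qth,op,\n");
print_r($qso);
```

Limiting explode() to 5 pieces is a design choice on my side, so that a colon typed into the comment does not shift the fields.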
Now let's find the uniq callsigns in the QSO list. Since they were randomly generated and are rather long (2 letters, 1 number and 3 letters), almost none (in percentage) were duplicates: out of 1 million (500K * 2) callsigns, 993864 were uniq.
OK, now we start to see that searching is fast and file creation is slow, and that alone starts to explain the slow log processing (or not, since I have no clue about the type of system used), especially if the logs come in over the network, although multiple concurrent connections and processes can speed it up... I'm sure DOS is not used :)
Also fast is searching for the position of one callsign's QSO's in the file:
...Especially after buffering (of the file read, in this case done by the OS)... see the difference between the first iteration of the program and the subsequent ones... I am sure that looking up the QSO positions of all call signs in the file after buffering would take less than 20 hours...
I didn't try different algorithms to optimize the system, nor did I use a database, something I hope LoTW uses. Also, the language chosen is not the most blazing fast for this type of operation.
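As a rough sanity check on that count, here is a small back-of-the-envelope calculation. It assumes every callsign is picked uniformly at random from the 24-letter alphabet and the digits 0-9 used by the generator script, which is only approximately true for rand(). Under that assumption the expected number of distinct callsigns comes out at roughly 993,700, so the measured 993864 is in the right ballpark.

```php
<?php
// My own arithmetic, not part of the original scripts.
$letters = 24;                      // the generator's alphabet has 24 letters
$space   = pow($letters, 5) * 10;   // 5 letters and 1 digit per callsign
$draws   = 1000000;                 // 500K QSO's * 2 callsigns each

// Expected number of distinct values after $draws uniform random picks
// from $space possibilities: space * (1 - (1 - 1/space)^draws)
$expected_uniq = $space * (1 - pow(1 - 1 / $space, $draws));

printf("possible callsigns: %d\n", $space);
printf("expected uniq:      %d\n", round($expected_uniq));
```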
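One different algorithm that could be tried, sketched here under my own assumptions and not benchmarked: instead of scanning the whole file once per callsign, read it a single time and remember where each callsign's lines start. The helper name build_offset_index() and the demo data are hypothetical. Note that it records the byte offset of the start of the line, not the position of the callsign inside the file as the strpos() script below does.

```php
<?php
// Sketch only: build callsign => list of byte offsets in a single pass.
// build_offset_index() is a hypothetical helper, not one of the post's scripts.
function build_offset_index(string $path): array
{
    $index = [];
    $fh = fopen($path, 'r');
    if ($fh === false) {
        return $index;
    }
    while (true) {
        $pos  = ftell($fh);      // byte offset where this line starts
        $line = fgets($fh);
        if ($line === false) {
            break;               // end of file
        }
        $pieces = explode(':', $line);
        if (count($pieces) < 2) {
            continue;            // skip blank or malformed lines
        }
        // both stations of the QSO point at the same line
        $index[$pieces[0]][] = $pos;
        $index[$pieces[1]][] = $pos;
    }
    fclose($fh);
    return $index;
}

// Tiny demo on a throwaway file with made-up QSO's
$demo = tempnam(sys_get_temp_dir(), 'qsl');
file_put_contents($demo, "ET3QPV:AB1CDE:201201011200:599:qth,op,\nAB1CDE:XY9ZZZ:201201011201:579:qth,op,\n");
$index = build_offset_index($demo);
print_r($index['AB1CDE']);       // offsets of the lines that mention AB1CDE
unlink($demo);
```

The catch is memory: an index for about a million callsigns has to fit in RAM, which is probably where a database starts to make sense.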
Here's the code used in case you need it for something...
------------//---------
Create 500k random qso's
------//-----------
<?php
// create random qso contacts between random call signs and write them to a file... just for testing matching qso's
// by CT2GQV 2012
// Licence: use and abuse, it's free
// if you don't change the settings it will create 500K records...
// settings
set_time_limit(120); // 2 minutes... instead of 30s... only with safe mode disabled.. or change php.ini..
$contacts_file = './contacts.qsl';
$create_how_many=500000;
// may not be possible in all systems...
$start_time=microtime(true);
// one stupid way of generating random chars...
$characters = array("A","B","C","D","E","F","G","H","J","K","L","M","N","P","Q","R","S","T","U","V","W","X","Y","Z");
// let's create
$rst_count=0;
$a=0;
// let's open the file before...
$fh = fopen($contacts_file, 'a') or die("ERROR: can't open contacts");
while ($a<$create_how_many) {
// 2 letters.... 1 number, 3 letters... for simplicity
$call1=$characters[rand(0,23)].$characters[rand(0,23)].rand(0,9).$characters[rand(0,23)].$characters[rand(0,23)].$characters[rand(0,23)];
$call2=$characters[rand(0,23)].$characters[rand(0,23)].rand(0,9).$characters[rand(0,23)].$characters[rand(0,23)].$characters[rand(0,23)];
// minimum signal is 233 :)
$rst=rand(2,5).rand(3,9).rand(3,9);
// just the creation date, in UTC... "H" keeps the hour at 2 digits ("G" drops the leading zero)
$utc=gmdate("YmdHi");
// just for fun...
if($rst=="599"){$rst_count++;};
// remove next line if no echo is needed
// echo "$call1:$call2:$utc:$rst:Just a comment\n";
///// create file contacts.qsl beforehand and chmod it to writable...
// $fh = fopen($contacts_file, 'a') or die("ERROR: can't open contacts");
$data_to_apend="$call1:$call2:$utc:$rst:qth,op,\n";
fwrite($fh, $data_to_apend);
// fclose($fh);
// add the counter...
$a++;
};
// closed only after the loop to save some time...
fclose($fh);
$end_time=microtime(true);
$time = $end_time - $start_time;
echo "\n\nDone $create_how_many contacts in $time seconds and $rst_count QSO's were 599...\n\n";
?>
--------//-----------
Find uniq call
-------//------------
<?php
// some settings
set_time_limit(120);
// the file with the QSO's
$qsl_file = './500kcontacts.qsl';
$uniq_call_file = './uniq-call.qsl';
$uniq_call_array=array();
$temp=array();
$start_time=microtime(true);
// let's loop the QSO's file
$file_handle = fopen($qsl_file, "r") or die("ERROR: can't open the QSO's file");
// where we are going to store the uniq callsigns ('w' so a re-run doesn't append duplicates)
$file_handle2 = fopen($uniq_call_file, 'w') or die("ERROR: can't open uniq callsign file");
$count=0;
while (($lines = fgets($file_handle)) !== false) {
$pieces=explode(":", $lines);
if(count($pieces)>=2 && $pieces[0]!="" && $pieces[1]!=""){ // only save when both callsigns are there...
// an empty callsign should not happen, rand on the creation doesn't allow it :)
$temp[]=$pieces[0]; $temp[]=$pieces[1];
};
}; // end loop reading the QSO's file
fclose($file_handle);
$uniq_call_array = array_unique($temp);
// it's good to free the big temporary array before another mem request...
unset($temp);
foreach ($uniq_call_array as $value) {
// echo "$value\n";
$add_to_file="$value\n";
fwrite($file_handle2, $add_to_file);
$count++;
}
fclose($file_handle2);
$end_time=microtime(true);
$time = $end_time - $start_time;
echo "\n$count Uniq callsigns written to $uniq_call_file\n";
// print_r($uniq_call_array); // uncomment to dump the whole list (it's long!)
echo "In $time seconds\n";
?>
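A possible variation on the script above, untested on the 500K file: use the callsigns as array keys, so duplicates collapse as they are read and the full list of 1 million entries never has to sit in memory before de-duplicating. The helper name uniq_calls_from_lines() and the sample lines are made up for illustration.

```php
<?php
// Sketch: collect uniq callsigns using array keys as a set.
// uniq_calls_from_lines() is a hypothetical helper, not from the script above.
function uniq_calls_from_lines(iterable $lines): array
{
    $seen = [];
    foreach ($lines as $line) {
        $pieces = explode(':', $line);
        if (count($pieces) < 2) {
            continue;                 // skip blank or malformed lines
        }
        $seen[$pieces[0]] = true;     // writing the same key twice is harmless
        $seen[$pieces[1]] = true;
    }
    return array_keys($seen);
}

// made-up sample lines; AB1CDE appears twice
$lines = [
    "ET3QPV:AB1CDE:201201011200:599:qth,op,\n",
    "AB1CDE:XY9ZZZ:201201011201:579:qth,op,\n",
];
print_r(uniq_calls_from_lines($lines));
```

Since the function accepts any iterable, it should also work with something like new SplFileObject('./500kcontacts.qsl'), which yields the file line by line.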
-------//---------
Find a contact from a call in the file
-------//---------
<?php
$start_time=microtime(true);
$search_call="ET3QPV";
$file = file_get_contents("./500kcontacts.qsl");
$offset = 0;
$counter = 0;
if(strpos($file, $search_call) === 0){ // strict check: strpos gives false when nothing is found, and false == 0
$counter++;
echo "\nQSO #$counter at pos: 0";
}
while(($offset = strpos($file, $search_call, $offset + 1)) !== false){
$counter++;
echo "\nQSO #$counter at pos: $offset";
}
$end_time=microtime(true);
$time = $end_time - $start_time;
echo "\nFound $counter QSO's in $time seconds";
$time=$time*993864;
$hour=$time/3600;
echo "\nFor 993864 call's that should be more or less: $time seconds... or $hour hours";
?>
Simple, huh? For now I will continue to use paper and a pen for log processing...


