From: Andy Williams
Subject: Request for Quote
Date: Wed, 11 Nov 2009 15:15:31 -0500

Hi,

I need to perform some statistical analysis on an ongoing basis for a worldwide
not-for-profit writing contest I'm launching soon. If anyone is able to provide the
six short scripts described below to run on my Linux web server, I would appreciate
it if they would respond immediately with a quote letting me know how much it would
cost and how long it would take.

I need four to six short scripts to process statistical data extracted from a MySQL
database. I can have someone else extract this data from the MySQL database if need
be. I will need to be able to invoke these scripts from the Linux command line using
a task scheduler. Display of the data in a GUI will not be required.

Code: find_maxima
This code should take a frequency
distribution as input and attempt to fit the frequency distribution to a
Gaussian curve. It should extrapolate the peak of the curve to five decimal
places and return this value as the result. The frequency distribution will be
a two-dimensional array of values for variables x and y, with x being a rating
from 1 to 10 and y being the number of times the rating has been given.

Code: get_std_dev
Given the frequency distribution described
above, I would like a script that outputs the standard deviation of the rating.
It would likely be more efficient to calculate the standard deviation as part
of find_maxima, and to simply retrieve the value here. In that case this
function will only be a shell or wrapper function.

Code: get_avg
Given the frequency distribution described
above, I would like a script that outputs the average rating. It would likely be
more efficient to calculate the average as part of find_maxima, and to simply
retrieve the value here. In that case this function will only be a shell or
wrapper function.

Code: detect_outlier
Given an integer “z” representing
a rating as well as the frequency distribution described above, I would like a
script that calls get_std_dev to calculate the standard deviation of the
ratings and then detects whether “z” is within one standard deviation
of the mean rating. The script should return 1 or ‘true’ if “z” is
more than one standard deviation from the mean and 0 or
‘false’ otherwise. It would likely be more efficient to calculate
the standard deviation as part of find_maxima, and to simply retrieve the value
here.

Code: detect_ratings_skew
Given an array of ratings made by a
particular judge and given the frequency distribution of the overall ratings
(described above), this code will compare the count of the judge’s ratings
that are within one standard deviation of the mean to the count of the judge’s
ratings that are determined to be outliers. If more than 32% of the judge’s
ratings are statistical outliers then the function will indicate a skew in the
individual judge’s ratings by returning 1 or ‘true’. The function
will return 0 or ‘false’ otherwise.

Pseudocode:
    If outliers / (outliers + ratings_within_norm) > 0.32 then
        Return true
    Else
        Return false

Code: adjust_ratings
Given the mean and standard deviation of the
overall ratings and the frequency distribution of ratings given by a particular
judge, this function should adjust the ratings of that judge so that they have
the same mean and standard deviation as the overall ratings. The output will be
a new two-dimensional array containing each rating from 1 to 10 and the
adjusted value of that rating.

Best regards,
Andy
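
For anyone scoping a quote, the six functions described above could be sketched
roughly as follows. This is only an illustration under several assumptions that
the request does not state: Python as the language, the frequency distribution
passed as a list of (rating, count) pairs, the population (not sample) standard
deviation, a least-squares parabola fit to ln(count) as the Gaussian fit in
find_maxima, and a linear rescaling in adjust_ratings. Reusing values computed
in find_maxima, as the request suggests, is not shown.

```python
import math


def get_avg(freq):
    """Mean rating of a list of (rating, count) pairs."""
    total = sum(count for _, count in freq)
    return sum(rating * count for rating, count in freq) / total


def get_std_dev(freq):
    """Population standard deviation of the rating (an assumed choice)."""
    total = sum(count for _, count in freq)
    mean = get_avg(freq)
    return math.sqrt(sum(c * (r - mean) ** 2 for r, c in freq) / total)


def _det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def find_maxima(freq):
    """Peak of a Gaussian fitted to the counts, rounded to 5 decimals.

    Assumed method: fit ln(count) = a + b*x + c*x^2 by least squares;
    the vertex -b / (2c) is the peak. Zero counts are skipped.
    """
    pts = [(x, math.log(y)) for x, y in freq if y > 0]
    if len(pts) < 3:
        raise ValueError("need at least three ratings with non-zero counts")
    s = [sum(x ** k for x, _ in pts) for k in range(5)]
    t = [sum(x ** k * ly for x, ly in pts) for k in range(3)]
    m = [[s[0], s[1], s[2]], [s[1], s[2], s[3]], [s[2], s[3], s[4]]]
    det = _det3(m)
    # Cramer's rule for the linear (b) and quadratic (c) coefficients.
    b = _det3([[r[0], t[i], r[2]] for i, r in enumerate(m)]) / det
    c = _det3([[r[0], r[1], t[i]] for i, r in enumerate(m)]) / det
    if c >= 0:
        raise ValueError("counts do not have a Gaussian-like peak")
    return round(-b / (2 * c), 5)


def detect_outlier(z, freq):
    """True if z is more than one standard deviation from the mean."""
    return abs(z - get_avg(freq)) > get_std_dev(freq)


def detect_ratings_skew(judge_ratings, freq, threshold=0.32):
    """True if more than `threshold` of the judge's ratings are outliers."""
    if not judge_ratings:
        return False
    outliers = sum(1 for z in judge_ratings if detect_outlier(z, freq))
    return outliers / len(judge_ratings) > threshold


def adjust_ratings(overall_mean, overall_sd, judge_freq):
    """Map each of the judge's ratings to an adjusted value.

    Assumed method: shift and scale linearly so the judge's ratings take on
    the overall mean and standard deviation.
    """
    j_mean, j_sd = get_avg(judge_freq), get_std_dev(judge_freq)
    if j_sd == 0:
        raise ValueError("judge gave a single rating value; cannot rescale")
    scale = overall_sd / j_sd
    return [(r, overall_mean + (r - j_mean) * scale) for r, _ in judge_freq]
```

Each function could be wrapped in a small command-line script and run from a
task scheduler such as cron. The log-parabola fit is sensitive to noise in
low-count ratings, so a weighted or iterative fit may be a better choice for
real contest data.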