
Standard deviation of a generic list?

big-blog 2020. 12. 31. 23:18


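I need to calculate the standard deviation of a generic list. I will try to include my code. It is a generic list with data in it; the data is mostly floats and ints. Here is the relevant part of my code, without going into too much detail: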

namespace ValveTesterInterface
{
    public class ValveDataResults
    {
        private List<ValveData> m_ValveResults;

        public ValveDataResults()
        {
            if (m_ValveResults == null)
            {
                m_ValveResults = new List<ValveData>();
            }
        }

        public void AddValveData(ValveData valve)
        {
            m_ValveResults.Add(valve);
        }

Here is the function where I need to calculate the standard deviation:

        public float LatchStdev()
        {

            float sumOfSqrs = 0;
            float meanValue = 0;
            foreach (ValveData value in m_ValveResults)
            {
                meanValue += value.LatchTime;
            }
            meanValue = (meanValue / m_ValveResults.Count) * 0.02f;

            for (int i = 0; i <= m_ValveResults.Count; i++) 
            {   
                sumOfSqrs += Math.Pow((m_ValveResults - meanValue), 2);  
            }
            return Math.Sqrt(sumOfSqrs /(m_ValveResults.Count - 1));

        }
    }
}

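Please ignore what is inside the LatchStdev() function, because I am sure it is not right. It is just my poor attempt to calculate the standard deviation. I know how to do it for a list of doubles, but not for a list of a generic data type. If someone has experience with this, please help.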


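This article should help you. It creates a function that computes the deviation of a sequence of double values. All you have to do is supply a sequence of the appropriate data elements.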

The resulting function is:

private double CalculateStdDev(IEnumerable<double> values)
{   
  double ret = 0;
  if (values.Count() > 0) 
  {      
     //Compute the Average      
     double avg = values.Average();
     //Perform the Sum of (value-avg)^2
     double sum = values.Sum(d => Math.Pow(d - avg, 2));
     //Put it all together      
     ret = Math.Sqrt((sum) / (values.Count()-1));   
  }   
  return ret;
}

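This is easy to adapt to any generic type, as long as you provide a selector for the value being computed. LINQ is great for that: the Select function lets you project, from your generic list of custom types, a sequence of numeric values for which to compute the standard deviation.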

List<ValveData> list = ...
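// CalculateStdDev is declared above as an ordinary (non-extension) method,
// so pass the projected sequence to it as an argument.
var result = CalculateStdDev(list.Select(v => (double)v.SomeField));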
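The selector idea described above can also be folded into one generic helper. The following is only a minimal sketch: the names `StdDevExtensions` and `StdDev` are hypothetical and not part of the original answers, and returning 0 for fewer than two values is an assumption made here to avoid dividing by zero.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class StdDevExtensions
{
    // Hypothetical helper (not from the original answers): sample standard
    // deviation of the numeric values that `selector` projects out of each item.
    public static double StdDev<T>(this IEnumerable<T> source, Func<T, double> selector)
    {
        // Materialize once so the source sequence is enumerated a single time.
        double[] values = source.Select(selector).ToArray();

        // Assumption: with fewer than two values there is no spread to report.
        if (values.Length < 2)
            return 0.0;

        double avg = values.Average();
        double sumOfSquares = values.Sum(d => (d - avg) * (d - avg));

        // Sample standard deviation: divide by (n - 1).
        return Math.Sqrt(sumOfSquares / (values.Length - 1));
    }
}
```

With such a helper, the question's list could be processed as m_ValveResults.StdDev(v => v.LatchTime), assuming LatchTime converts implicitly to double.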

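The example above is slightly incorrect and can run into a divide-by-zero problem if your population set contains only one element (in double arithmetic, 0/0 produces NaN rather than throwing). The following code is somewhat simpler and gives the "population standard deviation" result. ( http://en.wikipedia.org/wiki/Standard_deviation )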

using System;
using System.Linq;
using System.Collections.Generic;

public static class Extend
{
    public static double StandardDeviation(this IEnumerable<double> values)
    {
        double avg = values.Average();
        return Math.Sqrt(values.Average(v=>Math.Pow(v-avg,2)));
    }
}
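For reference, the two answers so far compute two different quantities. Written out in standard notation (the symbols s, σ, N and x̄ are introduced here and do not appear in the original posts), the first function computes the sample standard deviation and the extension method above computes the population standard deviation:

```latex
% Sample standard deviation (divides by N - 1), as in CalculateStdDev
s = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N}\left(x_i - \bar{x}\right)^2}

% Population standard deviation (divides by N), as in StandardDeviation
\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(x_i - \bar{x}\right)^2}

% Worked example for the values 2, 3, 4: \bar{x} = 3 and the squared
% deviations sum to 2, so
s = \sqrt{2/2} = 1, \qquad \sigma = \sqrt{2/3} \approx 0.816
```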

Even though the accepted answer seems mathematically correct, it is wrong from a programming perspective: it enumerates the same sequence 4 times. This might be OK if the underlying object is a list or an array, but if the input is a filtered or aggregated LINQ expression, or if the data is coming directly from a database or network stream, this would cause much lower performance.

I would highly recommend not reinventing the wheel and instead using one of the better open source math libraries, Math.NET. We have been using that library in our company and are very happy with its performance.

PM> Install-Package MathNet.Numerics

// requires: using MathNet.Numerics.Statistics;
var populationStdDev = new List<double> { 1d, 2d, 3d, 4d, 5d }.PopulationStandardDeviation();

var sampleStdDev = new List<double> { 2d, 3d, 4d }.StandardDeviation();

See http://numerics.mathdotnet.com/docs/DescriptiveStatistics.html for more information.

Lastly, for those who want the fastest possible result and are willing to sacrifice some precision, read about the "one-pass" algorithm: https://en.wikipedia.org/wiki/Standard_deviation#Rapid_calculation_methods
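As an illustration of the single-pass idea, here is a minimal sketch using Welford's online algorithm, one well-known way to get the mean and variance in a single enumeration. It is not taken from the answer above; the class and method names are hypothetical, and returning 0 for fewer than two values is an assumption.

```csharp
using System;
using System.Collections.Generic;

public static class OnePassStatistics
{
    // Hypothetical helper: sample standard deviation computed in one pass
    // with Welford's method (running mean plus running sum of squared deviations).
    public static double SampleStdDev(IEnumerable<double> values)
    {
        long n = 0;        // number of values seen so far
        double mean = 0.0; // running mean
        double m2 = 0.0;   // running sum of squared deviations from the mean

        foreach (double x in values)
        {
            n++;
            double delta = x - mean;  // deviation from the old mean
            mean += delta / n;        // update the mean
            m2 += delta * (x - mean); // uses both the old and the new mean
        }

        // Assumption: fewer than two values has no sample deviation, so return 0.
        return n < 2 ? 0.0 : Math.Sqrt(m2 / (n - 1));
    }
}
```

Because the sequence is enumerated only once, this shape also works for inputs that are expensive or impossible to re-read, such as a lazy LINQ query or a stream.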


I see what you're doing, and I use something similar. It seems to me you're not going far enough. I tend to encapsulate all data processing in a single class, so that I can cache the calculated values until the list changes. For instance:

using System.Collections.Generic;
using System.Linq;

public class StatProcessor
{
    private readonly List<double> _data = new List<double>(); // this holds the current data
    private double _avg;    // we cache the average here
    private bool _avgValid; // a flag to say whether the cached average is up to date

    // Calculate the average of the list, cache it in _avg, and set _avgValid.
    private void _calcAvg()
    {
        // Assumption: an empty list reports an average of 0.
        _avg = _data.Count > 0 ? _data.Average() : 0.0;
        _avgValid = true;
    }

    public double average
    {
        get
        {
            if (!_avgValid) // only recalculate when the cached value is stale
                _calcAvg(); // calculate it, cache it, and set the flag
            return _avg;    // now _avg is guaranteed to be good, so return it
        }
    }

    // ...more stuff

    public void Add(double value)
    {
        _data.Add(value);  // add to the list here, and reset the flag
        _avgValid = false;
    }
}

You'll notice that with this approach, only the first request for the average actually computes it. After that, as long as we don't add anything to the list (or remove or modify anything, though those operations aren't shown), we can get the average for basically nothing.

Additionally, since the average is used in the algorithm for the standard deviation, computing the standard deviation first will give us the average for free, and computing the average first will give us a small performance boost in the standard deviation calculation, assuming we remember to check the flag.

Furthermore, places like the average function, where you're already looping through every value anyway, are a great opportunity to cache things like the minimum and maximum values. Of course, requests for this information first need to check whether the values have been cached, and that can cause a relative slowdown compared to just finding the max from the list, since the calculation does the extra work of setting up all the related caches, not just the one you're accessing.
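To make the caching idea concrete, here is a rough sketch of a class that refreshes the average, minimum, maximum and standard deviation together and reuses them until the data changes. The class `CachedStats` and its members are hypothetical names introduced for illustration; using the population standard deviation and throwing on an empty list are assumptions, not something the answer above specifies.

```csharp
using System;
using System.Collections.Generic;

public class CachedStats
{
    private readonly List<double> _data = new List<double>();
    private bool _valid; // true while the cached values match _data
    private double _avg, _min, _max, _stdDev;

    public void Add(double value)
    {
        _data.Add(value);
        _valid = false; // any change invalidates every cached value
    }

    public double Average { get { Refresh(); return _avg; } }
    public double Min { get { Refresh(); return _min; } }
    public double Max { get { Refresh(); return _max; } }
    public double StdDev { get { Refresh(); return _stdDev; } }

    // Recompute all cached values, but only when the cache is stale.
    private void Refresh()
    {
        if (_valid) return;
        if (_data.Count == 0)
            throw new InvalidOperationException("No data."); // assumption: empty is an error

        // First loop: sum, minimum and maximum together.
        double sum = 0.0;
        double min = double.PositiveInfinity;
        double max = double.NegativeInfinity;
        foreach (double x in _data)
        {
            sum += x;
            if (x < min) min = x;
            if (x > max) max = x;
        }
        double avg = sum / _data.Count;

        // Second loop: squared deviations from the freshly computed average.
        double sumOfSquares = 0.0;
        foreach (double x in _data)
            sumOfSquares += (x - avg) * (x - avg);

        _avg = avg;
        _min = min;
        _max = max;
        _stdDev = Math.Sqrt(sumOfSquares / _data.Count); // population standard deviation
        _valid = true;
    }
}
```

The trade-off mentioned above shows up here: asking only for Max still pays for the average and the standard deviation, so this layout suits cases where several statistics are read between changes.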

ReferenceURL : https://stackoverflow.com/questions/3141692/standard-deviation-of-generic-list
