
Clojure Functions for Statistics

丸井淳史 / MARUI Atsushi

The following functions form a small statistics library that I wrote as an exercise while learning the Clojure programming language. They were originally written for Common Lisp, and it was striking how much simpler the code became when ported to Clojure.

I wrote this for my own learning (and enjoyment), so it is nowhere near comprehensive or complete. For serious statistical work, please use a full statistics library such as Incanter.

;; Sum
;; (sum [1 2 3 4]) ==> 10
;;
(defn sum [v]
  (apply + v))

;; Product
;; (prod [1 2 3 4]) ==> 24
;;
(defn prod [v]
  (apply * v))

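;; Factorial
;; (factorial 10) ==> 3628800
;; Uses *' so that results beyond the long range are promoted to BigInt
;; instead of throwing on overflow. x must be a non-negative integer.
;;
(defn factorial [x]
  {:pre [(not (neg? x))]}
  (reduce *' (range 1 (inc x))))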

;; Combination (binomial coefficient)
;; (nchoosek 9 5) ==> 126
;;
(defn nchoosek [n k]
  (if (or (> 0 k) (> 0 n) (> k n))
      nil
      (/ (factorial n) (factorial k) (factorial (- n k)))))
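
;; Combination, multiplicative form (a hedged sketch, not part of the
;; original library; the name nchoosek-mult is made up for illustration).
;; Builds C(n, k) as the running product of (n - k + i) / i for i = 1..k,
;; so no full factorial is ever computed. Each intermediate value is itself
;; a binomial coefficient, so every division is exact.
;; (nchoosek-mult 9 5) ==> 126
;;
(defn nchoosek-mult [n k]
  (if (or (> 0 k) (> 0 n) (> k n))
      nil
      (reduce (fn [acc i] (/ (*' acc (+ (- n k) i)) i))
              1
              (range 1 (inc k)))))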

;; Arithmetic Mean
;; (mean [1 2 3 4]) ==> 5/2
;; (float (mean [1 2 3 4])) ==> 2.5
;;
(defn mean [v]
  (/ (sum v) (count v)))

;; Geometric Mean
;; (geomean [1 2 3 4]) ==> 2.213364
;;
(defn geomean [v]
  (Math/pow (prod v) (/ 1 (count v))))

;; Sum of Squares
;; (sum-of-squares [1 2 3 4]) ==> 30
;;
(defn square [x]
  (* x x))
(defn sum-of-squares [v]
  (sum (map square v)))

;; Inner Product
;; (inner-product [1 2 3 4] [5 6 7 8]) ==> 70
;;
(defn inner-product [v1 v2]
  (apply + (map * v1 v2)))

;; Population Variance
;; (varp [1 2 3 4]) ==> 5/4
;;
(defn varp [v]
  (let [mu (mean v)]  ; compute the mean once rather than once per element
    (/ (sum (map #(square (- % mu)) v)) (count v))))

;; Sample Variance
;; (vars [1 2 3 4]) ==> 5/3
;;
(defn vars [v]
  (/ (* (varp v) (count v)) (- (count v) 1)))

;; Population Standard Deviation
;; (stddevp [1 2 3 4]) ==> 1.118033988749895
;;
(defn stddevp [v]
  (Math/sqrt (varp v)))

;; Sample Standard Deviation
;; (stddevs [1 2 3 4]) ==> 1.2909944487358058
;;
(defn stddevs [v]
  (Math/sqrt (vars v)))

;; Standardization (uses the population standard deviation)
;; (zscore [1 2 3 4]) ==> (-1.3416407864998738 -0.4472135954999579 0.4472135954999579 1.3416407864998738)
;; Returns nil for an empty input.
;;
(defn zscore [v]
  (when (seq v)
    (let [mu (mean v)
          sigma (stddevp v)]
      (map #(/ (- % mu) sigma) v))))

;; Normal Distribution
;; (normpdf 2 0 1) ==> 0.05399096651318806
;;
(defn normpdf [x mu sigma]
  (* (/ 1 (* (Math/sqrt (* 2 Math/PI)) sigma))
     (Math/exp (- (/ (Math/pow (- x mu) 2)
                     (* 2 sigma sigma))))))

;; Binomial Distribution
;; (binopdf 3 10 0.4) ==> 0.21499084799999998
;; x: number of successes, n: number of trials, p: success probability
;;
(defn binopdf [x n p]
  (* (nchoosek n x) (Math/pow p x) (Math/pow (- 1 p) (- n x))))
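
;; Sanity check for a binomial distribution function (a hedged sketch, not
;; part of the original library; binopdf-total is a hypothetical helper).
;; It takes any function with the same (x n p) argument order as binopdf
;; and sums it over x = 0..n. For a valid probability function the total
;; should be 1 up to floating-point rounding error.
;; (binopdf-total binopdf 10 0.4) should be very close to 1.0
;;
(defn binopdf-total [pdf n p]
  (reduce + (map #(pdf % n p) (range (inc n)))))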

;; Poisson Distribution
;; (poisspdf 0 1) ==> 0.36787944117144233
;; x: number of occurrences
;; p: rate parameter (lambda)
;;
(defn poisspdf [x p]
  (/ (* (Math/pow p x) (Math/exp (- p))) (factorial x)))
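
;; Poisson Distribution, log-space form (a hedged sketch, not part of the
;; original library; poisspdf-log is a made-up name). It evaluates
;; exp(x * ln(p) - p - ln(x!)), with ln(x!) computed as a sum of logarithms,
;; so the very large intermediate values p^x and x! are never formed.
;; Assumes p > 0 and a non-negative integer x.
;; (poisspdf-log 0 1) ==> 0.36787944117144233
;;
(defn poisspdf-log [x p]
  (let [log-fact (reduce + (map #(Math/log %) (range 1 (inc x))))]
    (Math/exp (- (* x (Math/log p)) p log-fact))))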



MARUI Atsushi
2013-01-12