Machine Learning Series: How to Count Chickens? (2) Computing the Threshold with Otsu's Method

in #cn, 7 years ago (edited)

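My procrastination struck again. Last week's first post, Machine Learning Series: How to Count Chickens? (1), covered converting the image to grayscale. The next step is to compute the threshold.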

When we converted the image to grayscale, we turned each RGBA pixel (4 bytes) into a single luminance byte, using the following formula:

Luminance = R * 0.21 + G * 0.72 + B * 0.07
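As a reminder of what that formula does per pixel, here is a minimal sketch. It is not taken from the first post: the class name GrayscaleSketch, the method names, and the R, G, B, A byte order of the input buffer are all assumptions made for illustration.

```csharp
using System;

static class GrayscaleSketch
{
    // Weighted luminance from the formula above. The weights sum to 1.0,
    // so the rounded result always fits in one byte.
    public static byte ToLuminance(byte r, byte g, byte b)
    {
        double y = r * 0.21 + g * 0.72 + b * 0.07;
        return (byte)Math.Round(y);
    }

    // Converts an RGBA buffer (4 bytes per pixel, ASSUMED order R, G, B, A)
    // into a grayscale buffer with 1 byte per pixel. Alpha is ignored.
    public static byte[] ToGrayscale(byte[] rgba)
    {
        byte[] gray = new byte[rgba.Length / 4];
        for (int i = 0; i < gray.Length; i++)
        {
            gray[i] = ToLuminance(rgba[4 * i], rgba[4 * i + 1], rgba[4 * i + 2]);
        }
        return gray;
    }

    static void Main()
    {
        // Three pixels: pure red, pure green, white.
        byte[] rgba = { 255, 0, 0, 255, 0, 255, 0, 255, 255, 255, 255, 255 };
        foreach (byte value in ToGrayscale(rgba))
        {
            Console.WriteLine(value);
        }
    }
}
```

Green contributes the most to the result and blue the least, which is why the pure green pixel ends up much brighter than the pure red one.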

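But this is still inconvenient for the classification that comes later. Ideally we only need 0 and 1, where 1 means chicken and 0 means air. So this step computes a single value, the threshold, which is then used to split every pixel into 0 or 1. For example, once we have the threshold, we can iterate over every pixel, compute its intensity as the average of its R, G and B values, and compare that intensity against the threshold to separate foreground from background (chicken from air).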
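The binarization pass described above might look roughly like the sketch below. It is not part of the credited original code. The names BinarizeSketch, Intensity and Binarize are hypothetical, the buffer is assumed to be RGBA with 4 bytes per pixel, and treating intensities above the threshold as foreground is also an assumption (the post only says 1 is chicken and 0 is air).

```csharp
using System;

static class BinarizeSketch
{
    // Intensity as described in the text: the plain average of R, G and B.
    public static int Intensity(byte r, byte g, byte b)
    {
        return (r + g + b) / 3;
    }

    // Returns one value per pixel: 1 (foreground, "chicken") when the pixel's
    // intensity is strictly above the threshold, otherwise 0 (background, "air").
    // ASSUMPTION: the input is RGBA, 4 bytes per pixel; alpha is ignored.
    public static byte[] Binarize(byte[] rgba, int threshold)
    {
        byte[] mask = new byte[rgba.Length / 4];
        for (int i = 0; i < mask.Length; i++)
        {
            int intensity = Intensity(rgba[4 * i], rgba[4 * i + 1], rgba[4 * i + 2]);
            mask[i] = (byte)(intensity > threshold ? 1 : 0);
        }
        return mask;
    }

    static void Main()
    {
        // One dark pixel and one bright pixel, split at threshold 100.
        byte[] rgba = { 10, 10, 10, 255, 200, 200, 200, 255 };
        foreach (byte bit in Binarize(rgba, 100))
        {
            Console.WriteLine(bit);
        }
    }
}
```

If the chickens in a given photo are darker than their surroundings, the comparison would need to be flipped.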

The C# code in this post comes from Gary Short of Microsoft.

Otsu's Method

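Otsu's method is used in image processing to reduce a grayscale image to a black-and-white (binary) image. The full algorithm is described on Wikipedia, so I won't go into detail here.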

Suppose the following method is used to obtain the threshold.

private static int GetThreshold(Bitmap image)

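The parameter is the (grayscale) image. First we need to turn the image into a byte array (the intent is one byte, i.e. 8 bits, per pixel).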

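// Caveat: ImageConverter returns the encoded image file bytes (header included), not raw pixels; Bitmap.LockBits gives raw pixel data.
byte[] imageBytes = (byte[])new ImageConverter().ConvertTo(image, typeof(byte[]));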

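Otsu's method then needs the image's histogram: the horizontal axis has 256 bins, one for each value from 0 to 255, and the vertical axis is the number of times each value occurs.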

int[] histogram = new int[256];
foreach (byte value in imageBytes)
{
    // byte is unsigned in C#, so it can index the histogram directly
    histogram[value]++;
}

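Next, compute the total number of pixels and the count-weighted sum of all intensities. The code works with raw counts rather than normalizing them into probabilities: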

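int totalPixels = imageBytes.Length;  // number of pixels
float sumOfIntensities = 0;
// cast before multiplying so that t * count cannot overflow int on large images
for (int t = 0; t < 256; t++) sumOfIntensities += (float)t * histogram[t];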

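Next, Otsu's method finds the threshold that minimizes the within-class (intra-class) variance of the two classes, which is the same as maximizing the between-class (inter-class) variance.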
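Why the two criteria agree can be written out in a few lines. The notation here is an assumption following the usual textbook presentation, not something from the original code: for a threshold t, ω₀ and ω₁ are the fractions of pixels at or below t and above t, μ₀ and μ₁ are the mean intensities of those two classes, and σ₀², σ₁² are their variances.

```latex
\begin{aligned}
\sigma^2 &= \sigma_w^2(t) + \sigma_b^2(t), \\
\sigma_w^2(t) &= \omega_0(t)\,\sigma_0^2(t) + \omega_1(t)\,\sigma_1^2(t), \\
\sigma_b^2(t) &= \omega_0(t)\,\omega_1(t)\,\bigl(\mu_0(t) - \mu_1(t)\bigr)^2 .
\end{aligned}
```

The total variance σ² does not depend on t, so whichever t makes σ_b² largest also makes σ_w² smallest. Working with raw pixel counts n₀ = Nω₀ and n₁ = Nω₁ instead of fractions only multiplies σ_b² by the constant N², which leaves the best t unchanged.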

// Try every possible threshold and keep the one that separates the two classes the most
float sumOfBackgroundIntensities = 0;
int backgroundWeight = 0;
int foregroundWeight = 0;
float maximumInterClassVariance = 0;
int threshold = 0;
for (int t = 0; t < 256; t++)
{
    backgroundWeight += histogram[t];
    if (backgroundWeight == 0) continue;

    foregroundWeight = totalPixels - backgroundWeight;
    if (foregroundWeight == 0) break;

    sumOfBackgroundIntensities += (float)(t * histogram[t]);

    float backgroundMean =
        sumOfBackgroundIntensities / backgroundWeight;

    float foregroundMean =
        (sumOfIntensities - sumOfBackgroundIntensities) / foregroundWeight;

    // Compute the between-class (inter-class) variance, up to a constant factor
    float interClassVariance =
        (float)backgroundWeight * (float)foregroundWeight *
            (backgroundMean - foregroundMean) *
            (backgroundMean - foregroundMean);

    // Update the maximum between-class variance
    if (interClassVariance > maximumInterClassVariance)
    {
        maximumInterClassVariance = interClassVariance;
        threshold = t;
    }
}
return threshold;
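Putting the pieces together, here is a self-contained sketch that can be run without System.Drawing. It follows the same steps as the code above, with two deliberate departures that are my assumptions rather than the original author's choices: it takes raw grayscale bytes (one per pixel) instead of a Bitmap, and it uses long and double instead of int and float to avoid overflow and rounding on large images. The class name OtsuSketch is hypothetical.

```csharp
using System;

static class OtsuSketch
{
    // pixels: one grayscale byte per pixel (ASSUMED input format).
    public static int GetThreshold(byte[] pixels)
    {
        // Histogram: how many pixels have each value 0..255.
        long[] histogram = new long[256];
        foreach (byte p in pixels) histogram[p]++;

        long totalPixels = pixels.Length;
        double sumOfIntensities = 0;
        for (int t = 0; t < 256; t++) sumOfIntensities += (double)t * histogram[t];

        double sumOfBackgroundIntensities = 0;
        long backgroundWeight = 0;
        double maximumInterClassVariance = 0;
        int threshold = 0;

        for (int t = 0; t < 256; t++)
        {
            // Background = all pixels with value <= t.
            backgroundWeight += histogram[t];
            if (backgroundWeight == 0) continue;

            long foregroundWeight = totalPixels - backgroundWeight;
            if (foregroundWeight == 0) break;

            sumOfBackgroundIntensities += (double)t * histogram[t];
            double backgroundMean = sumOfBackgroundIntensities / backgroundWeight;
            double foregroundMean =
                (sumOfIntensities - sumOfBackgroundIntensities) / foregroundWeight;

            // Between-class variance, scaled by totalPixels squared.
            double diff = backgroundMean - foregroundMean;
            double interClassVariance =
                (double)backgroundWeight * foregroundWeight * diff * diff;

            // Strict comparison: on ties the lowest threshold wins.
            if (interClassVariance > maximumInterClassVariance)
            {
                maximumInterClassVariance = interClassVariance;
                threshold = t;
            }
        }
        return threshold;
    }

    static void Main()
    {
        // Synthetic bimodal data: 50 dark pixels (20) and 50 bright pixels (200).
        byte[] pixels = new byte[100];
        for (int i = 0; i < pixels.Length; i++) pixels[i] = (byte)(i < 50 ? 20 : 200);
        Console.WriteLine(GetThreshold(pixels)); // prints 20
    }
}
```

For this synthetic input every threshold from 20 to 199 splits the two groups equally well, and because the comparison is strict the first one, 20, is returned. Pixels with a value at or below the returned threshold count as background.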

Remember this chicken picture from last week?

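Here is your homework: leave a comment telling me the threshold of this image. The first three people to answer correctly will each be rewarded with 3 SBD.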

To be continued...



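@justyy is the blogger behind https://justyy.com, a well-known "rich" blogger of the Western Hemisphere, who joined STEEMIT under the guidance of his big brother @tumutanzi. Thank you for reading @justyy's post. A follow, upvote or reply would be much appreciated.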

Computing the Threshold with Otsu's Method
Machine Learning Series: How to Count Chickens? (1)

Your comments are welcome, and I will upvote high-quality ones.
