小脑萎缩吃什么药好| 甲亢甲减有什么症状| 秀气是什么意思| 肝穿刺检查是什么意思| 玫瑰金是什么颜色| 腹泻能吃什么| 5.25是什么星座| 什么含胶原蛋白最多| 农历六月十一是什么星座| aug什么意思| 当驾校教练需要什么条件| 10.28什么星座| 云南简称是什么| 心影饱满是什么意思| 促什么谈什么| 7月一日是什么节日| 舌苔白吃什么药| 干咳无痰是什么原因引起的| 穆斯林为什么不吃猪肉| 一抹是什么意思| 干爹是什么意思| 聊是什么意思| 肝是干什么用的| 什么是基数| 舌尖发麻是什么问题| 1993属什么生肖| 湿气重看中医挂什么科| 胃炎胃溃疡吃什么药| 舌头有齿痕是什么原因| 女性腹部彩超检查什么| 尿里有潜血是什么原因| 脖子上长痘痘什么原因| 嘴巴有异味是什么原因| 月经是什么意思| 教师节应该送老师什么花| 地接是什么意思| 存货是什么| 白猫进家有什么预兆| 写生是什么意思| 心属于五行属什么| 什么是相位| 有什么好听的网名| 小孩脚后跟疼是什么原因| 走读是什么意思| 狗肉和什么一起炖最佳| 加拿大用什么货币| 舌头有问题看什么科| 玉和石头有什么区别| 1990年是什么年| 梦见发工资了是什么意思| 刮痧是什么| 贫嘴什么意思| 9月是什么星座的| 为什么喉咙总感觉有东西堵着| 出什么什么什么| 劳伦拉夫属于什么档次| 补锌吃什么| 人为什么会生气| 甲亢不能吃什么| 头发长的快是什么原因| 南京菜属于什么菜系| 抗核抗体谱检测查什么的| 糖耐量受损是什么意思| 小便多是什么原因男性| 贩子是什么意思| 麦粒肿是什么原因引起的| 轻微脑震荡有什么症状| 丹宁蓝是什么颜色| 紫菜吃多了有什么坏处| 摩罗丹主要治什么胃病| 星盘是什么| 念珠菌和霉菌有什么区别| 灰色配什么颜色好看| 羊水多是什么原因造成的| 什么的舞蹈| 肾阳虚女性什么症状| 12月10号是什么星座| 绝经是什么意思| 异类是什么意思| 枸杞喝多了有什么坏处| 蓝莓不能和什么一起吃| 艾滋病中期有什么症状| 孙五行属什么| imp是什么意思| 今夕何夕是什么意思| 比卡丘什么意思| 咸鸭蛋为什么会出油| vb是什么| 甲状腺结节什么症状| 清点是什么意思| 汤伤用什么药| 腿凉是什么原因引起的| 二本是什么学历| 腋下异味看什么科| 7月22日是什么星座| 目不暇接的意思是什么| 故作矜持的意思是什么| 双土是什么字| 月经不规律是什么原因| 老人流口水是什么原因引起的| 流鼻血看病挂什么科| spiderking是什么牌子| 土豆不能和什么食物一起吃| 甲亢吃什么好的更快| 大学硕士点是什么意思| 冗长什么意思| 腿麻脚麻用什么药能治| 二月出生是什么星座| 铁锈用什么能洗掉| 公测是什么意思| 杨梅吃了有什么好处| aoa是什么意思| csk是什么品牌| 搬新家有什么讲究和准备的| 肠子粘连有什么办法解决| 头孢治什么| 梦见被蛇缠身是什么意思| 脑萎缩是什么意思| 左肾盂分离是什么意思| nmol是什么单位| 飞机是什么| 打完除皱针注意事项有什么| 什么菜好消化又养胃| 1989是什么生肖| 剁椒鱼头是什么鱼头| 泰格豪雅属于什么档次| 檀香是什么味道| 甲是什么生肖| 癫痫病是什么原因引起的| 直白是什么意思| 梦见挖红薯是什么意思| 什么是造影手术| choice是什么意思| 脚后跟疼痛是什么原因| 中药什么时候喝| 珠颈斑鸠吃什么| 黑枸杞有什么功效| 维生素c十一什么意思| 膝盖积液挂什么科| 急性会厌炎吃什么药| 营养不良吃什么药| 河南为什么简称豫| 人生导师是什么意思| 韧带是什么样子图片| 什么是全脂奶粉| 矢的意思是什么| 报考军校需要什么条件| 子宫内膜2mm说明什么| 月经期间同房有什么危害| 什么牙膏好用| 近亲结婚生的孩子会得什么病| 百香果和什么不能一起吃| 蛇缠腰是什么病怎么治| 抽血抽不出来是什么原因| 痛经吃什么止疼药| 没有润滑剂可以用什么代替| 2004年是什么年| 眼睛干痒用什么眼药水比较好| 小拇指长痣代表什么| 嗜碱性粒细胞偏低说明什么| 者羽念什么| 艳阳高照是什么生肖| 子宫长什么样| 大败毒胶囊主治什么病| alp医学上是什么意思| 伤风胶囊又叫什么| 芳华什么意思| 12.16是什么星座| 落花流水什么意思| 身上有淤青是什么原因| 中国的国服是什么服装| 面红耳赤是什么生肖| 人为什么会脸红| 莫非的近义词是什么| 吃什么能快速补血| 垂问是什么意思| 反物质是什么| 2003年属羊的是什么命| 红色属于五行属什么| 杜松子是什么| 彩虹像什么| 尿次数多是什么原因| 为什么脸上会长痘痘| 36什么意思| 什么是催眠| exo什么时候出道的| 红五行属性是什么| 什么叫法令纹| 八卦中代表雷的卦象叫什么| 卤门什么时候闭合| 反酸是什么感觉| 阑尾炎痛起来什么感觉| 后脑勺发热是什么原因| 白话文是什么意思| 画龙点睛什么意思| 银屑病用什么药膏| 嘴唇出血是什么原因| 川普是什么意思| 刀厄痣是什么意思| 孤单是什么意思| 肝多发小囊肿什么意思| 什么像什么| led灯是什么灯| 什么是三好学生| 河豚是什么意思| 清一色是什么意思| 什么的道理| 烟雾病是什么原因引起的| 父亲节该送什么礼物| 24节气分别是什么| 什么命要承受丧子之痛| coser什么意思| 周杰伦是什么星座| 拔牙后吃什么| oppo是什么牌子| 尿气味很重是什么原因| 什么是亲子鉴定| 左眼跳是什么预兆| 眼眶疼是什么原因| 做肺部ct挂什么科| 为什么会抑郁| 为什么想吃甜食| 怀孕有什么特征和反应| 脸麻手麻是什么原因| 梦见吃水饺是什么预兆| 为什么会晒黑| 痛风什么药止痛最快| 双侧卵巢显示不清是什么意思| 为什么一热身上就痒| 例假淋漓不尽是什么原因造成的| 18点是什么时辰| 小暑是什么| 李世民是什么民族| 曹操的父亲叫什么名字| 男命正印代表什么| 经常流鼻血是什么病| 北京为什么这么热| 等离子体是什么| 11月16日是什么星座| 3.30是什么星座| 稠的反义词是什么| 盐冻虾是什么意思| 梦见老公怀孕什么预兆| 空腹喝牛奶为什么会拉肚子| 内伤湿滞什么意思| 糙皮病是什么病| 尿液弱阳性什么意思| 老是放屁是什么原因| 跟腱炎吃什么药| 月经期间吃什么补气血| 貂蝉原名叫什么| 胃溃疡吃什么好| 3月23日什么星座| 焕字五行属什么| 肺气肿吃什么食物好| 通勤什么意思| palace是什么牌子| 推什么出什么| 菊花茶泡了为什么会变绿| 月经什么时候来| 肛塞是什么| 咖啡渣子有什么用途| 什么情况下要打狂犬疫苗| 三高不能吃什么食物| 百度

2025-08-14

A bad workman blames his tools

One of the most depressing things about being a programmer is the realization that your time is not entirely spent creating new and exciting programs, but is actually spent eliminating all the problems that you yourself introduced.

This process is called debugging. And on a daily basis every programmer must face that fact that as they write code, they write bugs. And when they find that their code doesn't work, they have to go looking for the problems they created for themselves.

To deal with this problem the computer industry has built up an enormous amount of scar tissue around programs to make sure that they do work. Programmers use continuous integration, unit tests, assertions, static code analysis, memory checkers and debuggers to help prevent and help find bugs. But bugs remain and must be eliminated by human reasoning.

Some programming languages, such as C, are particularly susceptible to certain types of bugs that appear and disappear at random, and once you try figuring out what's causing them they disappear. These are sometimes called heisenbugs because as soon as you go searching for them they vanish.

These bugs can appear in any programming language (and especially when writing multi-threaded code where small changes in timing can uncover or cover race conditions). But in C there's another problem: memory corruption.

Whatever the cause of a bug the key steps in finding an eliminating a bug are:

  1. Find the smallest possible test case that tickles the bug. The aim is to find the smallest and fastest way to reproduce the bug reliably. With heisenbugs this can be hard, but even a fast way to reproduce it some percentage of the time is valuable.

  2. Automate that test case. It's best if the test case can be automated so that it can be run again and again. This also means that the test case can become part of your program's test suite once the bug is eliminated. This'll stop it coming back.

  3. Debug until you find the root cause. The root cause is vital. Unless you fully understand why the bug occurred you can't be sure that you've actually fixed it. It's very easy to get fooled with heisenbugs into thinking that you've eliminated them, when all you've done is covered them up.

  4. Fix it and verify using #2.


Yesterday, a post appeared on Hacker News entitled When you see a heisenbug in C, suspect your compiler’s optimizer. This is, simply put, appalling advice.

The compiler you are using is likely used by thousands or hundreds of thousands of people. Your code is likely used by you. Which is more likely to have been shaken out and stabilized?

In fact, it's a sign of a very poor or inexperienced programmer if their first thought on encountering a bug is to blame someone else. It's tempting to blame the compiler, the library, or the operating system. But the best programmers are those who control their ego and are able to face the fact that it's likely their fault.

Of course, bugs in other people's code do exist. There's no doubt that libraries are faulty, operating systems do weird things and compilers do generate odd code. But most of the time, it's you, the programmer's fault. And that applies even if the bug appears to be really weird.

Debugging is often a case of banging your head against your own code repeating to yourself all of the impossible things that can't ever happen in your code until one of those impossible things turns out to be possible and you've got the bug.

The linked article contains an example of exactly what not to conclude:

“OK, set your optimizer to -O0,”, I told Jay, “and test. If it fails to segfault, you have an optimizer bug. Walk the optimization level upwards until the bug reproduces, then back off one.”

All you know from changing optimization levels is that optimization changes whether the bug appears or not. That doesn't tell you the optimizer is wrong. You haven't found the root cause of your bug.

Since optimizers perform all sorts of code rearrangement and speed ups changing optimizer levels is very likely to change the presence or absence of a heisenbug. That doesn't make it the optimizer's fault; it's still almost certainly yours.

Here's a concrete example of a simple C program that contains a bug that appears and disappears when optimization level is changed, and exhibits other odd behavior. First, here's the program:

#include <stdlib.h>

int a()
{
int ar[16];

ar[20] = (getpid()%19==0);
}

int main( int argc, char * argv[] )
{
int rc[16];

rc[0] = 0;

a();

return rc[0];
}

Build this with gcc under Mac OS X with the following simple Makefile (I saved it in a file called odd.c):

CC=gcc
CFLAGS=

odd: odd.o

And here's a simple test program for run it 20 times and print the return code:

#!/bin/bash

for i in {0..20}
do
./odd ; echo -n "$? "
done
echo

If you run that test program you'd expect a string of zeroes, because rc[0] is never set to anything other than zero in the program. Yet here's sample output:

$ ./test
0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

If you are an experienced C programmer you'll see how I made that 1 appear (and why it appears at different places), but let's try to debug with quick a printf

[...]
rc[0] = 0;

printf( "[%d]", rc[0] );

a();
[...]

Now when you run the test program the bug is gone:

$ ./test
[0]0 [0]0 [0]0 [0]0 [0]0 [0]0 [0]0 [0]0 [0]0 [0]0 [0]0
[0]0 [0]0 [0]0 [0]0 [0]0 [0]0 [0]0 [0]0 [0]0 [0]0

Weird, so you move the printf:

[...]
rc[0] = 0;

a();

printf( "[%d]", rc[0] );
[...]

and get the same odd result of a disappearing bug. And the same thing happens if you turn the optimizer on even without the printfs (this is the opposite of the situation in the linked article):

$ make CFLAGS=-O3
gcc -O3 -c -o odd.o odd.c
gcc odd.o -o odd
$ ./test
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

This all came about because the function a() allocates a 16 integer array called ar and then promptly writes past the end of it either 1 or 0 depending on whether the PID of the process is divisible by 19 or not. It ends up writing on top of rc[0] because of the arrangement of the stack.

Adding printfs or changing optimization level changes the layout of the code and causes the bad write to not hit rc[0]. But beware! The bug hasn't gone, it's just writing on some other bit of memory.

Because C programs are suspectible to this sort of error it's vital that good tools are used to check for problems. For example, the static code check splint and the memory analyzer valgrind help eliminate tons of nasty C bugs. And you should build your software with the maximum warning level (I prefer warn-as-error) and eliminate them all.

Only once you've done all that should you start to suspect someone else's code. And even when you do, you need to follow the same steps to reproduce the bug and get to the root cause. Most of the time, unfortunately, bugs are your fault.

2025-08-14

If you're searching remember your TF-IDF

Some people seem to be very good at searching the web, others seem to be very poor at it. What differentiates them? I think it's unconcious knowledge of something called TF-IDF (or term frequency-inverse document frequency).

If you clicked through to that Wikipedia link you were probably confronted by a bunch of mathematics, and since you are reading this you probably hit the back button as quickly as possible. But knowing about TF-IDF requires no mathematical knowledge at all. All you need is some common sense.

Put yourself in the shoes of a search engine. Sitting on the hard disks of its vast collection of computers are all the web pages in existence (or almost). Along comes a query from a human.

The first thing the search engine does is discard words that a too common. For example, if the search query contained the word 'the' there's almost no point using it to try to distinguish web pages. All the English ones almost certainly contain the word 'the' (just look at this one and count them).

With the common words removed the search engines goes looking for pages that match the remaining terms and ranks them in some useful order (a lot of Google's success is based on their ranking algorithm). One thing the search engine can take into account is how common the remaining words in the search query are.

For example, suppose the query was "the first imagineer". The search engine ignores 'the' and looks for pages containing "first imagineer". Obviously the results returned need to contain both words, but 'imagineer' is special: it's a rare word. And relatively rare words are a human's best searching friends.

A rare word allows the search engine to cut down the number of pages it needs to examine enormously, and that ends up giving the user better results. The ideal rare word is one that appears almost only on the sort of pages the end user is looking for, and appears in those pages frequently.

In nerdy terms 'appears in those pages frequently' is the TF (or term frequency), and 'almost only in the sort of pages end user is looking for' is the IDF (inverse document frequency).

Since 'imagineer' is a rare word, if the search engine finds a page on which that word occurs many times it's more likely to be relevant to the person searching that on a page where 'imagineer' appears only a few times.

Since 'first' is fairly common it's contribution to the search results is less clear. If 'first' appears many times on the page, but 'imagineer' only once then it's likely that the page is of lesser interest.

When you are searching give a few seconds thought to TF-IDF and ask yourself 'what words are most likely to appear only in the sort of pages I am looking for?' You'll likely get to where you wanted to go much faster that way.

PS If you find out who the first imagineer was, drop me a line.
no.是什么意思 孕吐一般从什么时候开始 右乳钙化灶是什么意思 下午1点是什么时辰 桑寄生有什么功效
emba是什么意思 丝丝入扣是什么意思 血糖高一日三餐吃什么东西最适合 出岫是什么意思 能说会道是什么生肖
act什么意思 青岛属于什么气候 拉不出屎吃什么药 补钙吃什么好 脑梗前期有什么症状
梦到数钱代表什么预兆 吸允的读音是什么 西药是什么药 糖链抗原是什么意思 打蛋白针有什么作用
肾结石喝酒有什么影响chuanglingweilai.com 1988年属什么今年多大hcv8jop0ns8r.cn 致密是什么意思1949doufunao.com 什么加什么等于红色hcv8jop5ns1r.cn 鸭子喜欢吃什么hcv7jop9ns9r.cn
硅对人体有什么危害hcv8jop2ns1r.cn 5公里25分钟什么水平hcv9jop1ns3r.cn 分泌物过氧化氢阳性是什么意思hcv8jop4ns6r.cn 补体c1q偏低说明什么zhiyanzhang.com 应用心理学是什么hcv9jop5ns1r.cn
莞字五行属什么hcv9jop5ns5r.cn 5月30日是什么星座hcv9jop3ns5r.cn 木薯淀粉可以做什么0735v.com 为什么不要看电焊火花hcv9jop3ns7r.cn 淡奶是什么naasee.com
宫腔粘连带是什么意思hcv9jop6ns6r.cn 每天流鼻血是什么原因hcv7jop6ns2r.cn 96122是什么电话hcv8jop2ns2r.cn 三阳开泰是什么意思hcv9jop2ns8r.cn 排卵期出血是什么原因引起的hcv9jop1ns7r.cn
百度