Poka-yoke (ポカヨケ; English: foolproofing)

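Poka-yoke, known in English as foolproofing (fool = "fool", proof = "to guard against"), is an equipment-level or operational safeguard that keeps a human error, once it has occurred, from leading to an accident.
At the time, Mr. Shingo was working as a consultant and was studying a problem at Yamada Electric in Nagoya: workers assembling push-button switches sometimes forgot to insert a spring. His solution was to redesign the switch assembly operation so that it consisted of two steps.
1. Prepare by placing the two springs in a spring holder.
2. Take the two springs out of the spring holder and insert them into the switch.
After the change the assembly operation took longer, but a spring left behind in the holder made a missed insertion immediately visible. The mistake was eliminated, and as a result higher-quality products could be produced.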
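
The two-step operation above can be sketched in code. This is only an illustration: the `SpringHolder` class and `assemble_switch` function are hypothetical names, and the check at the end stands in for the worker seeing a spring left in the holder.

```python
class SpringHolder:
    """Hypothetical model of the spring holder used in the two-step operation."""

    def __init__(self):
        self.springs = 0

    def load(self):
        # Step 1: place two springs in the holder before assembling.
        self.springs = 2

    def take_one(self):
        if self.springs == 0:
            raise RuntimeError("holder is empty: step 1 was skipped")
        self.springs -= 1


def assemble_switch(holder, springs_to_insert=2):
    """Assemble one switch; refuse to finish if a spring is left in the holder."""
    holder.load()
    for _ in range(springs_to_insert):
        holder.take_one()  # Step 2: move a spring from the holder into the switch.
    if holder.springs != 0:
        raise RuntimeError(f"{holder.springs} spring(s) left in holder: switch is incomplete")
    return "switch OK"


print(assemble_switch(SpringHolder()))
```

Calling `assemble_switch(SpringHolder(), springs_to_insert=1)` raises an error instead of returning, which is the point: the mistake is caught inside the process rather than passed on.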
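Poka-yoke is counted among the basic concepts of the Toyota Production System. So that defective parts are never passed to the next process, each process assures quality itself, using mechanisms that make it impossible, or at least difficult, to produce a defect.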

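In other words, human error is anticipated at design time, and the process is designed so that mistakes are hard to make.
For example, to keep a machine from cutting the operator's hand, it can be designed so that the hand must be pressing a button located somewhere else.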
https://takuminotie.com/blog/2019/11/15/post-18830/

Reliability Growth Models

The objective of most reliability growth models is to account for corrective actions to estimate current and future reliability and other metrics of interest.

Reliability growth can be quantified by looking at various metrics of interest such as the increase in mean time between failures (MTBF), decrease in failure intensity, or increase in mission success probability.

Key estimates used in reliability growth management, such as demonstrated reliability, projected reliability, and estimates of the growth potential, can be expressed in terms of the MTBF, failure intensity, or mission reliability.

Changes in these values, typically as a function of test time, are collectively called reliability growth trends. They are usually presented as reliability growth curves, which are based on mathematical and statistical models called reliability growth models.

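In short, a reliability growth model estimates current and future reliability, and other metrics of interest, while accounting for the corrective actions taken.

Growth can be measured with several metrics, for example an increase in MTBF, a decrease in failure intensity, or an increase in mission success probability.

The change in these values, typically as a function of test time, is called the reliability growth trend and is drawn as a growth curve.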
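
As an illustration of how such a model turns failure data into an MTBF estimate, here is a minimal sketch. The notes above do not name a specific model; this sketch assumes the Crow-AMSAA (power law) model, where the expected number of failures by time t is lam * t**beta, fitted by maximum likelihood for a test that ends at a fixed time. The failure times are made up.

```python
import math


def fit_crow_amsaa(failure_times, total_time):
    """Maximum-likelihood fit of N(t) = lam * t**beta for a time-terminated test."""
    n = len(failure_times)
    beta = n / sum(math.log(total_time / t) for t in failure_times)
    lam = n / total_time ** beta
    return lam, beta


def instantaneous_mtbf(lam, beta, t):
    """Reciprocal of the failure intensity lam * beta * t**(beta - 1)."""
    return 1.0 / (lam * beta * t ** (beta - 1))


# Hypothetical cumulative failure times (hours) observed during a 400-hour test.
failure_times = [10, 30, 70, 150, 310]
total_time = 400.0

lam, beta = fit_crow_amsaa(failure_times, total_time)
current_mtbf = instantaneous_mtbf(lam, beta, total_time)
cumulative_mtbf = total_time / len(failure_times)
print(f"beta = {beta:.3f}")
print(f"cumulative MTBF = {cumulative_mtbf:.1f} h, current MTBF = {current_mtbf:.1f} h")
```

A fitted beta below 1 means failures are arriving more slowly as the test goes on, so the current MTBF comes out higher than the cumulative average. Plotting `instantaneous_mtbf` against test time gives one form of the growth curve.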

Bayesian Belief Networks (BBN)

A Bayesian Belief Network (also called a Bayesian Network or Belief Network) is a Probabilistic Graphical Model (PGM) that represents conditional dependencies between random variables through a Directed Acyclic Graph (DAG).

The main objective of these networks is to understand the structure of causal relations. To clarify this, let's consider a disease diagnosis problem. Given symptoms and the diseases that produce them, we construct our Belief Network. When a new patient comes, we can infer which disease or diseases the patient may have by computing a probability for each disease. Similar causal relations can be constructed for other problems, and inference techniques can be applied to them to obtain interesting results.

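For example, from symptoms and the diseases that cause them we can build a BBN. When a new patient arrives, we infer which disease the patient has from the probability of each disease given the observed symptoms. The same kind of causal structure can be used for other problems, and inference over it can produce interesting results.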
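
The diagnosis example can be sketched as a very small network with one disease node and two symptom nodes. All probabilities below are invented for illustration, and the inference is plain enumeration over the hidden variable rather than any particular library's algorithm.

```python
# Hypothetical network: Disease -> Fever, Disease -> Cough.
P_DISEASE = 0.01                    # prior P(disease)
P_FEVER = {True: 0.9, False: 0.1}   # P(fever | disease present / absent)
P_COUGH = {True: 0.8, False: 0.2}   # P(cough | disease present / absent)


def joint(disease, fever, cough):
    """P(disease, fever, cough), factorised along the edges of the DAG."""
    p = P_DISEASE if disease else 1 - P_DISEASE
    p *= P_FEVER[disease] if fever else 1 - P_FEVER[disease]
    p *= P_COUGH[disease] if cough else 1 - P_COUGH[disease]
    return p


def posterior_disease(fever, cough):
    """P(disease | fever, cough), summing over both states of the disease node."""
    with_disease = joint(True, fever, cough)
    without_disease = joint(False, fever, cough)
    return with_disease / (with_disease + without_disease)


print(round(posterior_disease(fever=True, cough=True), 3))   # 0.267
print(round(posterior_disease(fever=True, cough=False), 3))  # 0.022
```

Even with both symptoms present the posterior stays well below 1, because the assumed prior of 1% is low. Making that kind of trade-off visible is what the network is for.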

https://towardsdatascience.com/introduction-to-bayesian-belief-networks-c012e3f59f1b

Monte Carlo Simulation

What is Monte Carlo Simulation?

Monte Carlo Simulation, also known as the Monte Carlo Method or a multiple probability simulation, is a mathematical technique used to estimate the possible outcomes of an uncertain event.

How to use Monte Carlo methods?

Regardless of what tool you use, Monte Carlo techniques involve three basic steps:

1. Set up the predictive model, identifying both the dependent variable to be predicted and the independent variables (also known as the input, risk or predictor variables) that will drive the prediction.
2. Specify probability distributions of the independent variables. Use historical data and/or the analyst's subjective judgment to define a range of likely values and assign probability weights for each.
3. Run simulations repeatedly, generating random values of the independent variables. Do this until enough results are gathered to make up a representative sample of the near infinite number of possible combinations.

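In short, whichever tool is used: build a predictive model that names the dependent variable and the independent (input) variables; give each independent variable a probability distribution, based on historical data and/or the analyst's judgment; then run the simulation repeatedly with randomly generated input values until the results form a representative sample of the nearly infinite set of possible combinations.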

When a Monte Carlo Simulation is complete, it yields a range of possible outcomes with the probability of each result occurring.
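
The three steps can be sketched with the standard library alone. The model (total project cost as the sum of three task costs) and the triangular distributions are assumptions made up for this example.

```python
import random
import statistics


def simulate_total_cost(rng):
    """One trial: draw each independent variable and return the dependent one."""
    # Step 2: hypothetical triangular distributions, given as (low, high, most likely).
    design = rng.triangular(8, 15, 10)
    build = rng.triangular(20, 40, 25)
    test = rng.triangular(5, 12, 6)
    # Step 1: the predictive model is simply the sum of the three task costs.
    return design + build + test


def run_monte_carlo(trials=20_000, seed=42):
    """Step 3: repeat the trial many times and return the sorted outcomes."""
    rng = random.Random(seed)
    return sorted(simulate_total_cost(rng) for _ in range(trials))


results = run_monte_carlo()
mean_cost = statistics.fmean(results)
p10 = results[int(0.10 * len(results))]
p90 = results[int(0.90 * len(results))]
prob_over_50 = sum(r > 50 for r in results) / len(results)
print(f"mean {mean_cost:.1f}, 10th percentile {p10:.1f}, 90th percentile {p90:.1f}")
print(f"P(total cost > 50) = {prob_over_50:.3f}")
```

The sorted list of outcomes is the "range of possible outcomes", and the percentiles and the exceedance probability are read straight off it.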

https://www.ibm.com/topics/monte-carlo-simulation

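Confidence interval, confidence level, and margin of error

1. Population parameter
A characteristic of the entire population, such as its mean or standard deviation.

2. Statistic
A quantity computed from a sample and used to estimate a population parameter.

Some populations have infinitely many elements. Others are finite, but measuring every element is not feasible in cost or time. So the population parameter is unknown.
We draw a sample from the population and compute the sample statistic, which is known, and then use it to estimate the population parameter.
This estimation involves the confidence interval, the confidence level, and the margin of error. What does each of them mean? Here is a rough explanation.

3. Confidence interval
First take a sample and compute a sample statistic such as the mean or a percentage. Then compute the sampling error from the data. Adding and subtracting the sampling error from the sample statistic gives the upper and lower endpoints of the estimate, and that range is called the confidence interval. If you are estimating a percentage P, for example the percentage of people who have a television at home, the confidence interval can be computed with this formula: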

Formula for the confidence interval of a population proportion:

CI = p̂ ± Z* × √( p̂ × (1 - p̂) / n )

p̂ = the proportion in your sample (e.g. the proportion of respondents who said they watched any television at all)
Z*= the critical value of the z-distribution
n = the sample size

If you are estimating a mean, the confidence interval can be computed with this formula:

CI = X̄ ± Z* × s / √n

X̄ = the sample mean
Z* = the critical value of the z-distribution
s = the sample standard deviation
√n = the square root of the sample size

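Here Z* is a value taken from the z-distribution and can be looked up in a table. It depends on the confidence level, which is usually 90%, 95%, or 99%. The z values are given in the table below: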

Confidence level	90%	95%	99%
alpha (1 - confidence level)	0.1	0.05	0.01
Area in each tail of a two-sided CI (alpha/2)	0.05	0.025	0.005
z* for a two-sided CI	1.645	1.960	2.576

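The formula shows that the width of the confidence interval is inversely proportional to the square root of the sample size. So the larger the sample, the narrower the confidence interval and the more precise the estimate.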
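
The two interval formulas (proportion and mean) can be sketched as follows. The survey numbers are hypothetical, and the critical value comes from the standard library's `statistics.NormalDist` rather than from a table.

```python
import math
from statistics import NormalDist


def z_star(confidence):
    """Two-sided critical value of the standard normal distribution."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def proportion_ci(p_hat, n, confidence=0.95):
    """Confidence interval for a population proportion."""
    margin = z_star(confidence) * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


def mean_ci(x_bar, s, n, confidence=0.95):
    """Confidence interval for a population mean (z-based)."""
    margin = z_star(confidence) * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin


# Hypothetical survey: 820 of 1000 respondents have a television at home.
low, high = proportion_ci(0.82, 1000)
print(f"proportion: ({low:.3f}, {high:.3f})")

# Hypothetical sample: mean 55, standard deviation 15, at two sample sizes.
for n in (100, 400):
    low, high = mean_ci(55, 15, n)
    print(f"n = {n}: ({low:.2f}, {high:.2f}), width {high - low:.2f}")
```

Going from n = 100 to n = 400 halves the width of the interval, which is the inverse square-root relationship described above.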

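4. Confidence level: in statistics this equals 1 - α, where α is the significance level. It is the proportion of confidence intervals that would contain the true population value if the same sampling and calculation were repeated many times with the same technique.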

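5. Margin of error: the estimate plus and minus the margin of error gives the confidence interval. For example, if a sample gives an average commute time of 55 minutes with a margin of error of ±3 minutes, the confidence interval is (52, 58) minutes.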
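
The repeated-sampling meaning of the confidence level can be checked with a small simulation. The population (normal, mean 55, standard deviation 15) and all other parameters are assumptions chosen for the example.

```python
import math
import random
from statistics import NormalDist, fmean, stdev


def coverage(true_mean=55.0, true_sd=15.0, n=50, trials=2000, confidence=0.95, seed=1):
    """Fraction of simulated confidence intervals that contain the true mean."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        x_bar = fmean(sample)
        margin = z * stdev(sample) / math.sqrt(n)  # margin of error for this sample
        if x_bar - margin <= true_mean <= x_bar + margin:
            hits += 1
    return hits / trials


print(f"share of intervals containing the true mean: {coverage():.3f}")
```

The share should come out close to the chosen confidence level. It tends to sit slightly below it here because the sketch uses the z value with an estimated standard deviation and a moderate sample size.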

Reference:
https://www.scribbr.com/statistics/confidence-interval/

FMEA

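This is a method for preventing failures and defects and improving reliability.
FMEA (Failure Mode and Effects Analysis) starts from the elements that make up a system or process. For each one it predicts the failure modes that could occur, considers their possible causes and effects, evaluates them in advance, identifies the problems in the design or plan, and takes countermeasures.
The method is mainly used to prevent problems before they happen. FMEA is divided into two kinds: DFMEA (Design) and PFMEA (Process).

See here for details.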
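
One common way to carry out the evaluation step is the risk priority number (RPN), the product of severity, occurrence, and detection ratings. The notes above do not describe the scoring, so treat this as an assumed convention. The failure modes and ratings are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    name: str
    severity: int    # 1 (negligible) to 10 (hazardous)
    occurrence: int  # 1 (rare) to 10 (almost certain)
    detection: int   # 1 (always detected) to 10 (cannot be detected)

    @property
    def rpn(self) -> int:
        """Risk priority number: higher means the mode needs attention sooner."""
        return self.severity * self.occurrence * self.detection


# Hypothetical PFMEA rows for a switch assembly step.
modes = [
    FailureMode("spring missing", severity=8, occurrence=4, detection=6),
    FailureMode("wrong label", severity=3, occurrence=2, detection=2),
    FailureMode("loose screw", severity=6, occurrence=3, detection=5),
]

for mode in sorted(modes, key=lambda m: m.rpn, reverse=True):
    print(f"{mode.name}: RPN {mode.rpn}")
```

Sorting by RPN gives a rough order in which to plan countermeasures. The rating scales themselves vary between organizations.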

Regression Test/Logistic Regression Test

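Regression Test

※ This has nothing to do with regression testing in software testing. What is meant here is the statistical regression test, also called regression analysis.

Regression analysis is a statistical technique that attempts to explore and model the relationship between two or more variables. For example, an analyst may want to know if there is a relationship between road accidents and the age of the driver.
Put simply, it tests whether a given factor has an effect on the outcome.
That is: Y = A0 + A1×X1 + A2×X2 + … + An×Xn
The null hypothesis is A1 = A2 = … = An = 0.
The alternative hypothesis is that at least one of them is not 0.
As for the test itself: with several factors you can use the ANOVA (F-test) approach, and with a single factor you can use a t-test.
See here for details.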
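
For the single-factor case, the t-test on the slope can be sketched without any library. The data are made up. The critical value 2.306 is the two-sided 5% point of Student's t with 8 degrees of freedom, and it matches the 10 observations used here.

```python
import math


def simple_linear_regression(xs, ys):
    """Least-squares fit of y = a0 + a1*x; returns a0, a1 and the t statistic of a1."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    a1 = sxy / sxx
    a0 = y_bar - a1 * x_bar
    # Residual variance uses n - 2 degrees of freedom (two fitted coefficients).
    sse = sum((y - (a0 + a1 * x)) ** 2 for x, y in zip(xs, ys))
    se_a1 = math.sqrt(sse / (n - 2) / sxx)
    return a0, a1, a1 / se_a1


# Hypothetical data: 10 observations, so the t statistic has 8 degrees of freedom.
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1, 18.0, 19.9]
T_CRITICAL = 2.306  # two-sided 5% critical value of Student's t with 8 df

a0, a1, t_stat = simple_linear_regression(xs, ys)
print(f"a0 = {a0:.3f}, a1 = {a1:.3f}, t = {t_stat:.1f}")
print("reject H0 (A1 = 0)" if abs(t_stat) > T_CRITICAL else "cannot reject H0")
```

If the absolute t statistic exceeds the critical value, the null hypothesis A1 = 0 is rejected at the 5% level, meaning the factor appears to affect the outcome.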

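Logistic Regression Test

※ Doesn't "logistic" mean logistics? It apparently also has a sense related to symbolic logic. (The name here actually comes from the logistic function used to model the probability.)

The same as the Regression Test, except that here the dependent variable is binary. See here.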
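
A minimal sketch of fitting a logistic regression with one predictor by gradient ascent on the log-likelihood. The data (hours studied against pass or fail), the learning rate, and the step count are all assumptions for illustration. Statistical packages normally use a different fitting algorithm.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(xs, ys, lr=0.1, steps=5000):
    """Fit P(y = 1 | x) = sigmoid(b0 + b1*x) by gradient ascent on the log-likelihood."""
    b0 = b1 = 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [y - sigmoid(b0 + b1 * x) for x, y in zip(xs, ys)]
        b0 += lr * sum(errors) / n
        b1 += lr * sum(e * x for e, x in zip(errors, xs)) / n
    return b0, b1


# Hypothetical data: hours studied (x) and pass = 1 / fail = 0 (y).
hours = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]
passed = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]

b0, b1 = fit_logistic(hours, passed)
for h in (1.0, 3.0, 4.5):
    print(f"P(pass | {h} h) = {sigmoid(b0 + b1 * h):.2f}")
```

A positive b1 means the probability of the outcome rises with the predictor. Testing whether b1 differs from 0 plays the same role as the null hypothesis in the ordinary regression test.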

Normality Test

For each combination of mean and standard deviation (SD), a theoretical normal distribution can be determined. That distribution is defined by the proportions below:

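In effect, a normality test checks whether about 68% of the data falls within 1σ of the mean, about 95% within 2σ, and about 99.7% within 3σ.
There are many test methods. The K-S (Kolmogorov-Smirnov) test that I studied is one of them.

See here for a detailed explanation.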
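
Both ideas, the 1σ/2σ/3σ shares and the K-S distance, can be sketched with the standard library. The samples are simulated. Because the normal curve is fitted to the same data, the usual K-S critical values do not apply exactly, so the statistic here is only compared between samples.

```python
import random
from statistics import NormalDist, fmean, stdev


def within_k_sigma(data, k):
    """Fraction of the data within k sample standard deviations of the mean."""
    m, s = fmean(data), stdev(data)
    return sum(abs(x - m) <= k * s for x in data) / len(data)


def ks_statistic(data):
    """K-S distance between the empirical CDF and a normal CDF fitted to the data."""
    fitted = NormalDist(fmean(data), stdev(data))
    ordered = sorted(data)
    n = len(ordered)
    d = 0.0
    for i, x in enumerate(ordered):
        cdf = fitted.cdf(x)
        # The empirical CDF jumps from i/n to (i + 1)/n at x.
        d = max(d, (i + 1) / n - cdf, cdf - i / n)
    return d


rng = random.Random(0)
normal_sample = [rng.gauss(100, 15) for _ in range(2000)]
skewed_sample = [rng.expovariate(1 / 15) for _ in range(2000)]

for name, sample in (("normal", normal_sample), ("exponential", skewed_sample)):
    shares = [round(within_k_sigma(sample, k), 3) for k in (1, 2, 3)]
    print(name, shares, f"D = {ks_statistic(sample):.3f}")
```

For the normal sample the three shares land near 68%, 95%, and 99.7% and the distance D is small. For the skewed sample D is much larger, which is the signal a normality test looks for.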

Linux notes

Setting the time

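When Linux and Windows are installed on the same machine, setting the time correctly in Windows makes the Linux time wrong, and fixing it in Linux makes the Windows time wrong again. The reason is that the two systems read the hardware (BIOS) clock differently: Linux by default treats it as UTC, while Windows treats it as local time.
The fix is simple. Tell Linux to keep the hardware clock in local time:
timedatectl set-local-rtc 1 --adjust-system-clock
To switch back to the default, use:
timedatectl set-local-rtc 0
To synchronize the time automatically over the internet you need an NTP service. On Ubuntu, install it like this:
sudo apt-get install ntp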

Setting the root password

sudo passwd

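How to get rid of the annoying Keyring

Some people always think they are being clever. Take the annoying Keyring in Linux, which pops up every time to ask for your password. The web is full of questions about how to get rid of the damned keyring, along with a pile of answers that miss the point. Someone even wrote a piece of software just to deal with it. What were the people who developed keyring thinking?
So how do you get rid of it?
First choose "Run" from the menu and search for "pas". That finds the program called "Passwords and Keys".

Double-click to run it, right-click "login", choose Change Password, and set the password to empty.

This program is close to useless. Protecting things with the Linux login password was already good enough.
With keyring in place, everyone just sets the password to empty, which actually lowers the security of the system.
Nothing wins a fight against people's wish to save effort and have things convenient. People will risk their lives for convenience: there may be a footbridge, yet someone always crosses the road directly and gets hit by a car (then again, the roads in China that are designed purely for cars are terrible design too).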