GitLab Usage Guide

This is a short configuration guide I wrote for the team after setting up GitLab.

Step1: Use ssh-keygen to generate a new key pair, gitlab_rsa / gitlab_rsa.pub

cd ~/.ssh
ssh-keygen -t rsa -C "tanhao2013@foxmail.com" -f gitlab_rsa # use your own email

Step2: Add the SSH public key to GitLab

cat ~/.ssh/gitlab_rsa.pub # copy the output into the SSH Keys page of your GitLab profile

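Setting Up a Code Repository with GitLab

Before I joined, the company hosted its code in SVN on a Windows server, and every access grant meant logging in to the server remotely and authorizing the user by hand. Seeing how many GitHub-like platforms are available, I picked GitLab for a trial and recommended that everyone migrate to Git.

1. The setup is simple: download the package, install it, and start the service.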

curl -O https://downloads-packages.s3.amazonaws.com/centos-6.6/gitlab-7.6.1_omnibus.5.3.0.ci.1-1.el6.x86_64.rpm

yum install openssh-server postfix cronie

service postfix start && chkconfig postfix on

rpm -i gitlab-7.6.1_omnibus.5.3.0.ci.1-1.el6.x86_64.rpm

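Then edit gitlab.rb as described in the installation instructions and start the service (on an omnibus install, gitlab-ctl reconfigure applies the changes). Remember to set up forwarding for port 8080 and the SSH port.

2. Upgrading to the latest GitLab with Docker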

Updated 2016-03-24

## Update the firewall rules
iptables -A INPUT -m state --state NEW -p tcp --dport 10022 -j ACCEPT
iptables -A INPUT -m state --state NEW -p tcp --dport 8080 -j ACCEPT
service iptables save
service iptables status
service iptables restart
iptables -L

service docker restart
docker run --detach \
--hostname gitlab.example.com \
--env GITLAB_OMNIBUS_CONFIG="external_url 'http://119.*.*.*/'; gitlab_rails['lfs_enabled'] = true;" \
-p 443:443 -p 8080:80 -p 10022:22 \
--name gitlab \
--restart always \
--volume /srv/gitlab/config:/etc/gitlab \
--volume /srv/gitlab/logs:/var/log/gitlab \
--volume /srv/gitlab/data:/var/opt/gitlab \
gitlab/gitlab-ce:latest
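
With the port mapping above (-p 10022:22), SSH clones have to name port 10022 explicitly, and the scp-style git@host:path form has no place for a port. Below is a minimal Python sketch of building such a clone URL. The function name ssh_clone_url and the project path team/project are made up for illustration, and gitlab.example.com is only the placeholder hostname from the command above.

```python
def ssh_clone_url(host, project_path, port=22, user="git"):
    """Build an SSH clone URL for a Git server (hypothetical helper).

    On the default port the short scp-style form works; any other port
    needs the explicit ssh:// form, because scp-style cannot carry a port.
    """
    path = project_path.strip("/")
    if not path.endswith(".git"):
        path += ".git"
    if port == 22:
        return f"{user}@{host}:{path}"
    return f"ssh://{user}@{host}:{port}/{path}"


if __name__ == "__main__":
    # 10022 is the host port published for the container's port 22.
    print(ssh_clone_url("gitlab.example.com", "team/project", port=10022))
```

The resulting URL can be passed straight to git clone. An alternative that avoids long URLs is a Host entry with Port 10022 in ~/.ssh/config.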

rCharts and morris.js

1. Using rCharts and morris.js for time series visualization

Here is the code:

# install once with: devtools::install_github("ramnathv/rCharts")
library(rCharts)
# type.1h.csv has no header row: the first column is the timestamp, followed by ten series
df <- read.csv("type.1h.csv", header = FALSE, stringsAsFactors = FALSE)
colnames(df) <- c("date", "1", "2", "3", "4", "5", "6", "7", "8", "9", "10")
# convert the date column to character strings
df$date <- as.character(df$date)
m1 <- mPlot(x = "date", y = c("1", "2", "3", "4", "5", "6", "7", "8", "9", "10"), type = "Line", data = df)
m1$set(pointSize = 0, lineWidth = 1)
m1$print("chart2")
m1
# base64enc is needed for saving; install it once with:
# install.packages("base64enc")
library("base64enc")
m1$save('graph1.html', 'inline', cdn=TRUE)
#m1$save('graph1.html', 'inline', standalone=TRUE)

2. Graph

Here, we can see this is a test graph!

Here is a test image!

Here is a test page!
graph1.html

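Reinstalling CentOS on a Server and Setting a Static IP

Five of our colocated servers in the telecom data center were attacked, so for safety my boss asked me to reinstall the operating system. The installation can be done from a burned disc or from a USB drive (details omitted).

1. Reinstall the system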

reboot (Ctrl+Alt+Delete)
->
F2(system manage)
->
F11(BIOS Menu)
->
BIOS Boot Setting
->
Boot Sequence
->
# optical disc or USB drive
CD/DVD or USB drive
->
OK
->
Install from video / USB drive
->
No Test
->
Basic
->
Fresh
->
Use All
->
Basic Server
->
reboot

2. Configure a static IP for network access

#### Internal or external IP address
IPADDR=192.168.1.201

#### 2.1 Gateway configuration
cp /etc/sysconfig/network /etc/sysconfig/network.bak
echo "
NETWORKING=yes
NETWORKING_IPV6=yes
GATEWAY=192.168.1.1
" >> /etc/sysconfig/network

#### 2.2 Network interface configuration
cp /etc/sysconfig/network-scripts/ifcfg-em1 /etc/sysconfig/network-scripts/ifcfg-em1.bak
sed -i 's/BOOTPROTO=dhcp/BOOTPROTO=none/' /etc/sysconfig/network-scripts/ifcfg-em1
sed -i 's/ONBOOT=no/ONBOOT=yes/' /etc/sysconfig/network-scripts/ifcfg-em1

echo "
BROADCAST=192.168.1.255
IPADDR=$IPADDR
NETWORK=192.168.1.0
" >>/etc/sysconfig/network-scripts/ifcfg-em1

#### 2.3 DNS resolution
echo "nameserver 202.103.24.68" >/etc/resolv.conf

#### 2.4 Test
chkconfig | grep network
service network restart
ping www.baidu.com
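
The BROADCAST and NETWORK values in the script follow from the address and its netmask. As a sanity check, here is a small Python sketch that derives them with the standard ipaddress module. The /24 prefix (netmask 255.255.255.0) is my assumption, since the script never states a netmask, and static_ip_settings and gateway_in_network are hypothetical helper names.

```python
import ipaddress


def static_ip_settings(ipaddr, prefix_len=24):
    """Derive the ifcfg-style values for an address (prefix length is assumed)."""
    iface = ipaddress.ip_interface(f"{ipaddr}/{prefix_len}")
    net = iface.network
    return {
        "IPADDR": str(iface.ip),
        "NETMASK": str(net.netmask),
        "NETWORK": str(net.network_address),
        "BROADCAST": str(net.broadcast_address),
    }


def gateway_in_network(gateway, ipaddr, prefix_len=24):
    """Check that the gateway is reachable on the same subnet as the address."""
    net = ipaddress.ip_interface(f"{ipaddr}/{prefix_len}").network
    return ipaddress.ip_address(gateway) in net


if __name__ == "__main__":
    for key, value in static_ip_settings("192.168.1.201").items():
        print(f"{key}={value}")
    print(gateway_in_network("192.168.1.1", "192.168.1.201"))
```

Running the check before restarting the network service helps catch a gateway that sits outside the subnet, which would otherwise only show up as a failed ping.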

3. References

http://blog.51yip.com/linux/1120.html
https://github.com/iofdata/DM/issues/8

Sysbench for MySQL Testing

1. The complete test script

    host=localhost
    port=3306
    socket=/home/data/mysql/mysql.sock
    user=root
    password=123456

    resultsdir=./results-thread

    threads="8 16 32 64 128"

    sizes="1000000 5000000 10000000 15000000 20000000 25000000 30000000"


    printf "sizes,threads,transactions,trns p/s,deadlocks, dls p/s,read/write requests,r/w reqs p/s,min,avg,max,95 percentile \n" >> stat.txt

    mkdir -p $resultsdir

    for thread in $threads; do
        mkdir -p $resultsdir/thread-$thread
        for size in $sizes; do
            # create and populate table test$size (the students database must already exist)
            sysbench --test=oltp --mysql-table-engine=innodb \
            --oltp-table-size=$size  --mysql-socket=$socket \
            --mysql-user=$user --mysql-host=$host \
            --mysql-password=$password --mysql-db=students \
            --oltp-table-name=test$size prepare;
            # run the OLTP workload and append the raw output to the report file
            sysbench --test=oltp --mysql-table-engine=innodb \
            --oltp-table-size=$size --mysql-socket=$socket \
            --mysql-user=$user --mysql-host=$host \
            --mysql-password=$password --mysql-db=students \
            --oltp-table-name=test$size  --max-requests=1000 \
            --num-threads=$thread run | \
            tee -a $resultsdir/thread-$thread/sysbench.$thread.$size.report;
            # drop the test table
            sysbench --test=oltp --mysql-host=$host  --mysql-user=$user \
            --mysql-password=$password --mysql-socket=$socket \
            --mysql-db=students --oltp-table-name=test$size  cleanup;

            # keep only the summary lines and flatten them into one CSV row per run
            cat $resultsdir/thread-$thread/sysbench.$thread.$size.report | \
            egrep "cat|threads:|transactions:|deadlocks|read/write|min:|avg:|max:|percentile:" | \
            sed  -e '1 s/Number of threads: //' | \
            tr -d "\n" | \
            sed -e 's/Number of threads: /\n/g' \
            -e 's/[A-Za-z\/]\{1,\}://g' \
            -e 's/read\/write//g' \
            -e 's/approx\.  95//g' \
            -e 's/per sec.)//g' \
            -e 's/ms//g' \
            -e 's/(//g'  \
            -e 's/  */,/g' | awk -v d=$size '{$0=d","$0}1' >> stat.txt
        done
    done
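
The grep/sed chain at the end of the loop is compact but hard to read. As an alternative, here is a rough Python sketch that pulls the same fields out of one report with regular expressions. It assumes the report looks like sysbench 0.4 OLTP output. The sample text below is made up for illustration and is not a real measurement, and parse_report is a hypothetical helper name.

```python
import re

# Made-up sample in the style of a sysbench 0.4 OLTP report (not real results).
SAMPLE_REPORT = """\
Number of threads: 8

OLTP test statistics:
    transactions:                        1000   (339.35 per sec.)
    deadlocks:                           0      (0.00 per sec.)
    read/write requests:                 19000  (6447.67 per sec.)

Test execution summary:
    per-request statistics:
         min:                                  7.68ms
         avg:                                 23.53ms
         max:                                 90.65ms
         approx.  95 percentile:              40.33ms
"""


def _find(pattern, text):
    """Return the captured groups of the first match, or fail loudly."""
    match = re.search(pattern, text)
    if match is None:
        raise ValueError(f"pattern not found: {pattern}")
    return match.groups()


def parse_report(text):
    """Extract the summary fields of one report into a dict."""
    row = {"threads": int(_find(r"Number of threads:\s*(\d+)", text)[0])}
    counters = [
        ("transactions", "transactions"),
        ("deadlocks", "deadlocks"),
        ("rw_requests", "read/write requests"),
    ]
    for key, label in counters:
        pattern = re.escape(label) + r":\s*(\d+)\s*\(([\d.]+) per sec\.\)"
        count, rate = _find(pattern, text)
        row[key] = int(count)
        row[key + "_per_sec"] = float(rate)
    for key in ("min", "avg", "max"):
        row[key + "_ms"] = float(_find(r"\b" + key + r":\s*([\d.]+)ms", text)[0])
    row["percentile_ms"] = float(_find(r"percentile:\s*([\d.]+)ms", text)[0])
    return row


if __name__ == "__main__":
    print(parse_report(SAMPLE_REPORT))
```

A parser like this fails with an error when a field is missing, whereas the sed chain silently produces a short row.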

Single-Library Genome Assembly

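An Illumina report compared how read length, coverage, insert size and other factors affect assembly results. It shows that under ideal conditions, for a simple genome, about 30X of short-insert reads plus a moderate amount of long-insert reads can cover enough of the genome and give good N50 and related metrics.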
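
Since N50 comes up repeatedly, here is a minimal Python sketch of how it is usually computed: sort the contig lengths from longest to shortest and report the length at which the running total first reaches half of the assembly size. The function name n50 and the toy lengths are illustrative only.

```python
def n50(lengths):
    """Length L such that contigs of length >= L hold at least half of all bases."""
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare doubled running sum to avoid fractional halves.
        if running * 2 >= total:
            return length


if __name__ == "__main__":
    # Toy contig lengths, not from any real assembly.
    print(n50([80, 70, 50, 40, 30, 20]))
```

For the toy input the total is 290, and the two longest contigs (80 + 70 = 150) already pass the halfway mark of 145, so the sketch reports 70.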

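Early Sanger sequencing used libraries ranging from 1 kb to 40 kb, possibly to avoid the effects of repetitive sequences. Later, SOAPdenovo followed a 200 bp, 500 bp, 2 kb, 5 kb, 10 kb library scheme when assembling the human genome. The exact strategy differs from genome to genome, but in general both short-insert libraries (<2 kb) and long-insert libraries (>2 kb) are needed. ABySS, by contrast, used only 42X of data from a single 210 bp library when assembling an African human genome.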

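GAGE evaluated the performance of several assemblers and includes a section titled "Effect of multiple libraries on assembly". Based on my own project experience, the multi-library strategy exists to support scaffolding. Contig assembly relies mainly on the overlaps between reads: as long as the sequencing is random and uniform and the depth is sufficient, short-insert reads assemble well into contigs (consensus sequences without Ns). The contig step does not involve library information (insert size and paired-end relationships). Scaffolding, in contrast, needs library information to link contigs to each other, and what matters most there is a graded series of libraries larger than 2 kb. That is why software such as ALLPATHS recommends one short-insert library plus one large-insert library.

The jewel wasp is a haploid species with a fairly small genome. Because the individuals are small and DNA extraction is difficult, one bee sample is not enough to build three short-insert libraries (200, 500, 800 bp), so we can try to build just one or two libraries, which should not affect contig assembly much. (A single-chromosome ant I once assembled also had only one 500 bp library because of sample constraints, and the result was good.)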
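
To make the role of insert size in scaffolding concrete, here is a simplified Python sketch that estimates the gap between two contigs from read pairs spanning them. It assumes contig A lies before contig B in the same orientation, that each pair is given as the start of the fragment on A and the end of the fragment on B, and that every fragment has the nominal insert size. Real scaffolders model the insert-size distribution and orientation. The function name estimate_gap and all numbers are illustrative.

```python
from statistics import mean


def estimate_gap(pairs, contig_a_len, insert_size):
    """Estimate the gap between contig A and contig B.

    pairs: list of (start_on_a, end_on_b) coordinates, where start_on_a is
    the 0-based position on A where the fragment begins and end_on_b is the
    number of bases of B the fragment covers from B's start.
    """
    gaps = []
    for start_on_a, end_on_b in pairs:
        covered_on_a = contig_a_len - start_on_a
        # Whatever the fragment does not spend on A or B must lie in the gap.
        gaps.append(insert_size - covered_on_a - end_on_b)
    return mean(gaps)


if __name__ == "__main__":
    # Two toy read pairs from a hypothetical 500 bp library.
    print(estimate_gap([(900, 150), (800, 60)], contig_a_len=1000, insert_size=500))
```

A pair can only link two contigs when the gap is shorter than the insert, which is why larger libraries are needed to bridge larger gaps.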

We also note that with recent assemblers such as fermi, de novo assembly of a human genome is already possible from one sample, one library, and 35X of data.
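
For a sense of scale, coverage is simply the total number of sequenced bases divided by the genome size. The Python sketch below turns a target depth into a read count. The 100 bp read length and the 3 Gb genome size are my assumptions for illustration and are not taken from the fermi work.

```python
import math


def coverage(n_reads, read_len, genome_size):
    """Average depth: total sequenced bases divided by genome size."""
    return n_reads * read_len / genome_size


def reads_needed(target_coverage, read_len, genome_size):
    """Smallest number of reads that reaches the target average depth."""
    return math.ceil(target_coverage * genome_size / read_len)


if __name__ == "__main__":
    genome_size = 3_000_000_000  # assumed genome size in bases
    read_len = 100               # assumed read length in bases
    n = reads_needed(35, read_len, genome_size)
    print(n, coverage(n, read_len, genome_size))
```

The same arithmetic applies to a small insect genome; only genome_size changes.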

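To support the follow-up analysis and discussion, I will also look up published bee and ant assembly papers for everyone. Hymenoptera research is quite active right now, so there is plenty to draw on. To move this project forward quickly, we do not have to build three libraries. That is my opinion.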

Bayesian Genome Assembly and MCMC Assessment

Introduction

They first build an assembly graph starting from a de Bruijn graph of the reads. Then they remove all tips and merge all unambiguous paths into single nodes that are annotated by the sequence of merged k-mers.
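
The construction can be sketched in a few lines of Python. This is a toy version under my own assumptions, not the authors' implementation: nodes are k-mers, an edge joins two k-mers that are adjacent in some read, tip removal is left out, and isolated cycles are ignored. The names build_de_bruijn and merge_unambiguous_paths, the reads, and k=3 are illustrative.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Nodes are k-mers; u -> v when v directly follows u in some read."""
    nodes = set()
    succ = defaultdict(set)
    pred = defaultdict(set)
    for read in reads:
        kmers = [read[i:i + k] for i in range(len(read) - k + 1)]
        nodes.update(kmers)
        for u, v in zip(kmers, kmers[1:]):
            succ[u].add(v)
            pred[v].add(u)
    return nodes, succ, pred


def merge_unambiguous_paths(nodes, succ, pred):
    """Collapse each maximal non-branching path into one sequence."""

    def starts_path(node):
        # A node continues a path only if it has a single predecessor
        # and that predecessor has no other successor.
        parents = pred.get(node, ())
        if len(parents) != 1:
            return True
        (parent,) = parents
        return len(succ.get(parent, ())) != 1

    merged = []
    visited = set()
    for node in sorted(nodes):
        if not starts_path(node):
            continue
        seq, cur = node, node
        visited.add(node)
        while len(succ.get(cur, ())) == 1:
            (nxt,) = succ[cur]
            if len(pred.get(nxt, ())) != 1 or nxt in visited:
                break
            seq += nxt[-1]  # consecutive k-mers overlap by k-1 bases
            visited.add(nxt)
            cur = nxt
        merged.append(seq)
    return merged


if __name__ == "__main__":
    # Two toy reads that differ at one base, which creates a bubble.
    reads = ["TACGGATC", "TACGCATC"]
    nodes, succ, pred = build_de_bruijn(reads, k=3)
    print(sorted(merge_unambiguous_paths(nodes, succ, pred)))
```

For the two toy reads the merged nodes are the shared left flank, the two variant branches, and the shared right flank.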

The resulting unresolved assembly graph (no longer de Bruijn) is a directed graph that consists only of bubbles and is a minimal representation of the variants that can be inferred from the sequenced data. Concatenating the sequences across the nodes in a particular path through this graph gives a possible assembly sequence.
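
As a small illustration of that last step, the Python sketch below enumerates every source-to-sink path in a hand-built bubble graph and concatenates the node sequences. To keep it short I assume the graph is acyclic and that node sequences do not overlap, so plain concatenation is enough. The node names, sequences, and the function enumerate_assemblies are all made up for the example.

```python
def enumerate_assemblies(node_seq, edges, source, sink):
    """Return the sequence spelled by every path from source to sink.

    node_seq maps node name to its sequence; edges maps node name to a list
    of successor names. The graph is assumed to be acyclic.
    """
    results = []

    def walk(node, prefix):
        seq = prefix + node_seq[node]
        if node == sink:
            results.append(seq)
            return
        for nxt in edges.get(node, []):
            walk(nxt, seq)

    walk(source, "")
    return results


if __name__ == "__main__":
    # One bubble: a shared left flank, two variants, a shared right flank.
    node_seq = {"left": "TACG", "var_g": "G", "var_c": "C", "right": "ATC"}
    edges = {"left": ["var_g", "var_c"], "var_g": ["right"], "var_c": ["right"]}
    for assembly in enumerate_assemblies(node_seq, edges, "left", "right"):
        print(assembly)
```

Each additional independent bubble doubles the number of paths, so exhaustive enumeration is only practical for small graphs, which is one motivation for sampling paths instead.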