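CLOC: A Tool for Counting Lines of Code

At work I sometimes need to count lines of code, and wc usually gives a good enough rough figure. But when the source files are scattered across many directories and written in several different languages, wc is not a good fit.

I have seen colleagues write scripts for this inside the company, but it seemed like a common enough need that a general tool should exist, and that is how I found CLOC. It is just a Perl script, easy to use, and its report is clear, so I recommend it here. Below is an example that counts the lines of the leveldb source code.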

$ cloc .
     128 text files.
     123 unique files.                                          
     353 files ignored.

http://cloc.sourceforge.net v 1.55  T=0.5 s (238.0 files/s, 46718.0 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
C++                             60           2012           1258          13124
C/C++ Header                    52            968           1458           2690
HTML                             3             84              0           1094
C                                1             33              7            255
make                             1             43             17            153
CSS                              1             10              1             78
Bourne Shell                     1              9             19             46
-------------------------------------------------------------------------------
SUM:                           119           3159           2760          17440
-------------------------------------------------------------------------------
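
To make the columns of the report concrete, here is a minimal Python 3 sketch of the same kind of count. It is not how CLOC works internally: the `LANGUAGES` table and the `classify` and `count_tree` names are hypothetical, and only single-line comments are recognized, so block comments such as /* ... */ would be counted as code.

```python
import os
from collections import defaultdict

# Hypothetical, deliberately tiny table: extension -> (language, line-comment prefix).
LANGUAGES = {
    ".py": ("Python", "#"),
    ".sh": ("Bourne Shell", "#"),
    ".c": ("C", "//"),
    ".cc": ("C++", "//"),
    ".h": ("C/C++ Header", "//"),
}


def classify(lines, prefix):
    """Return (blank, comment, code) counts for an iterable of lines."""
    blank = comment = code = 0
    for line in lines:
        stripped = line.strip()
        if not stripped:
            blank += 1
        elif stripped.startswith(prefix):
            comment += 1
        else:
            code += 1
    return blank, comment, code


def count_tree(root):
    """Walk root and return {language: [files, blank, comment, code]}."""
    totals = defaultdict(lambda: [0, 0, 0, 0])
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            ext = os.path.splitext(name)[1]
            if ext not in LANGUAGES:
                continue  # unknown file types are ignored
            language, prefix = LANGUAGES[ext]
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="replace") as f:
                blank, comment, code = classify(f, prefix)
            row = totals[language]
            row[0] += 1
            row[1] += blank
            row[2] += comment
            row[3] += code
    return dict(totals)
```

Calling count_tree('.') returns one row per language, in the same column order as the report above.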

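WordPress Blog Comment Merge Tool

Following up on the previous post, here is a small tool I wrote for merging WordPress blog comments. It merges the comments left on the same post on two mirrored WordPress blogs.

The steps for merging are:

1. Download my modified WordPress importer plugin from here, then install and activate it the same way as any other WordPress plugin.

2. In the WP admin panel, choose "Tools -> Import -> WordPress" and upload the XML file exported from the mirror WP blog.

3. On the next screen, select "Only Merge Comments". This is very important!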


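4. Click submit and wait a moment.

I did not reinvent the wheel here; I only modified WordPress Importer, the default WordPress blog import tool, and added a little functionality. As long as "Only Merge Comments" is checked, the tool is safe to use: it only adds comments from the XML file to posts that already exist in the current blog, does nothing for posts that do not exist, does not add comments that are already present, and filters out some spam comments. With this option you can repeat the import as many times as you like :)

Known limitations: the only criterion the tool uses to decide whether a post exists is the post title, so several posts with the same title may cause problems (untested). I do not guarantee that the tool is thoroughly tested, so it is best to try it on a local mirror before using it for real; if you skip that test, be sure to back up your blog before merging.

Here is my patch: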

--- wordpress-importer/wordpress-importer.php    2010-06-02 00:38:23.000000000 +0800
+++ ../../www/blog/wp-content/plugins/wordpress-importer/wordpress-importer.php    2010-09-29 19:33:57.953790929 +0800
@@ -49,2 +49,3 @@
     var $fetch_attachments = false;
+    var $only_merge_comments = false;
     var $url_remap = array ();
@@ -258,2 +259,7 @@

+<h2><?php _e('Only Merge Comments', 'wordpress-importer'); ?></h2>
+<p>
+    <input type="checkbox" value="1" name="comments" id="merge-comments" />
+    <label for="merge-comments"><?php _e('Only merge comments, ignore post, tags...', 'wordpress-importer') ?></label>
+</p>
<?php
@@ -483,3 +489,7 @@

-        $post_exists = post_exists($post_title, '', $post_date);
+        if ($this->only_merge_comments) {
+            $post_exists = post_exists($post_title, '', '');
+        } else {
+            $post_exists = post_exists($post_title, '', $post_date);
+        }

@@ -489,4 +499,7 @@
             $comment_post_ID = $post_id = $post_exists;
-        } else {
-
+        } else if ( $this->only_merge_comments) {
+            echo '<li>';
+            printf(__('Post <em>%s</em> not found, comments not updated.', 'wordpress-importer'), stripslashes($post_title));
+            $comment_post_ID = $post_id = $post_exists;
+        } else {
             // If it has parent, process parent first.
@@ -605,3 +618,11 @@
                 // if this is a new post we can skip the comment_exists() check
-                if ( !$post_exists || !comment_exists($comment['comment_author'], $comment['comment_date']) ) {
+                if ($this->only_merge_comments) {
+                    if ( $post_exists && !comment_exists($comment['comment_author'], $comment['comment_date']) && $comment['comment_author'] != 'Unknown') {
+                        if (isset($inserted_comments[$comment['comment_parent']]))
+                            $comment['comment_parent'] = $inserted_comments[$comment['comment_parent']];
+                        $comment = wp_filter_comment($comment);
+                        $inserted_comments[$key] = wp_insert_comment($comment);
+                        $num_comments++;
+                    }
+                } else if ( !$post_exists || !comment_exists($comment['comment_author'], $comment['comment_date']) ) {
                     if (isset($inserted_comments[$comment['comment_parent']]))
@@ -847,5 +868,7 @@
         $this->get_entries();
-        $this->process_categories();
-        $this->process_tags();
-        $this->process_terms();
+        if (!$this->only_merge_comments) {
+            $this->process_categories();
+            $this->process_tags();
+            $this->process_terms();
+        }
         $result = $this->process_posts();
@@ -891,2 +914,4 @@
                 $fetch_attachments = ! empty( $_POST['attachments'] );
+                $only_merge_comments = ! empty( $_POST['comments'] );
+                $this->only_merge_comments = (bool) $only_merge_comments;
                 $result = $this->import( $_GET['id'], $fetch_attachments);
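
The merge rule that the patch adds can be sketched outside WordPress. The Python 3 snippet below is only an illustration under assumptions: the in-memory model (a dict from post title to a list of comment dicts) and the `merge_comments` name are hypothetical, and the remapping of comment parents done in the patch is left out. It mirrors the three checks in the patch: the post must already exist (matched by title), the comment must not already be present (matched by author and date), and comments whose author is 'Unknown' are dropped.

```python
def merge_comments(existing_posts, imported_posts):
    """Merge imported comments into existing posts, matched by title.

    Both arguments map a post title to a list of comment dicts with
    'author', 'date' and 'text' keys (a hypothetical in-memory model).
    Returns the number of comments added.
    """
    added = 0
    for title, comments in imported_posts.items():
        if title not in existing_posts:
            # Merge-only mode never creates posts.
            continue
        target = existing_posts[title]
        # A comment counts as a duplicate if author and date both match.
        seen = {(c["author"], c["date"]) for c in target}
        for comment in comments:
            key = (comment["author"], comment["date"])
            if comment["author"] == "Unknown" or key in seen:
                continue
            target.append(comment)
            seen.add(key)
            added += 1
    return added
```

Because duplicates are detected by author and date, running the merge a second time on the same input adds nothing, which is why repeated imports are harmless.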

An IPv6 Enabled NTP Client for Windows in Python

The Python NTP library (ntplib) offers a simple interface for querying NTP servers from Python, but it does not support IPv6 NTP servers. I wrote a patch for ntplib that adds IPv6 support. You can download the patch file here and the patched library here.

The code below is a simple IPv6-enabled NTP client (ntpdate.py) in Python for Windows, built on the patched ntplib. It doesn't (and won't) support Linux, because the official NTP release already offers IPv6 support on that platform.

#!/usr/bin/env python
# ntpdate.py - set the date and time via NTP
# An IPv6 enabled ntp client, for Windows ONLY.

import ntplib, time
import sys                # usage() calls sys.exit()
from os import system
from sys import argv

def usage():
  print '''Usage: ntpdate.py  [-qh] server
Example:
  ntpdate.py 210.72.145.44      # IPv4
  ntpdate.py ntp6.remco.org     # IPv6
Options:

  -q     Query only - don't set the clock.
  -h     Print this message.

IPv6 NTP Server List:
  ntp6.remco.org               [2001:888:1031::2]
  ntp6.space.net               [2001:608:0:dff::2]
  time.buptnet.edu.cn          [2001:da8:202:10::60]
  time.join.uni-muenster.de    [2001:638:500:717:2e0:4bff:fe04:bc5f]
  ntp.sixxs.net                [2001:1291:2::b]
  ntp.eu.sixxs.net             [2001:808::66]
  ntp.us.sixxs.net             [2001:1291:2::b]
  ntp.rhrk.uni-kl.de           [2001:638:208:9::116]
  ntp.ipv6.uni-leipzig.de      [2001:638:902:1::10]
  ntp.hexago.com               [2001:5c0:0:2::25]
  ntp1.bit.nl                  [2001:7b8:3:2c::123]

Report bugs to http://solrex.org.'''
  sys.exit()

def main():
  ntp_svr = ''
  query = False

  for a in argv[1:]:
    if a == '-q':
      query = True
    elif a == '-h':
      usage()
    else:
      ntp_svr = a
  if ntp_svr == '':
    usage()

  c = ntplib.NTPClient()
  res = c.request(ntp_svr, version=3)
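  # res.offset is already the correction to apply to the local clock;
  # the round-trip delay must not be added on top of it.
  t_epoch = time.time() + res.offset
  t = time.localtime(t_epoch)
  centi_sec = int(t_epoch % 1 * 100)    # hundredths of a second, 0..99
  time_str = time.strftime('%H:%M:%S', t)
  if not query:
    # %02d zero-pads the hundredths; %2.0f could print ' 5' or round up to 100
    system('time %s.%02d' % (time_str, centi_sec))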
    date_str = time.strftime('%Y-%m-%d', t)
    system('date %s' % date_str)
  if query:
    print 'server %s, stratum %d, offset %f, delay %f' % (
           ntp_svr, res.stratum, res.offset, res.delay)
  print '%s %s ntpdate.py: time server %s offset %f sec' % (
         time.strftime('%d %b', t), time_str, ntp_svr, res.offset)

if __name__ == '__main__':
  main()
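
For readers curious what an IPv6-capable query involves underneath, here is a minimal Python 3 sketch that does not use ntplib at all. It is not the patch mentioned above; the function names are hypothetical. The idea is to let socket.getaddrinfo choose the address family (IPv6 or IPv4) for the host, send a 48-byte client packet over UDP, and apply the standard NTP offset and delay formulas to the four timestamps.

```python
import socket
import struct
import time

NTP_EPOCH_DELTA = 2208988800  # seconds between 1900-01-01 and 1970-01-01


def build_request(version=3):
    """Return a 48-byte client request: LI=0, VN=version, Mode=3 (client)."""
    first_byte = (0 << 6) | (version << 3) | 3
    return bytes([first_byte]) + b"\x00" * 47


def ntp_to_unix(seconds, fraction):
    """Convert an NTP timestamp (two 32-bit halves) to Unix time."""
    return seconds - NTP_EPOCH_DELTA + fraction / 2**32


def offset_and_delay(t1, t2, t3, t4):
    """Standard NTP arithmetic.

    t1: client send, t2: server receive, t3: server transmit,
    t4: client receive. The offset is what to add to the local clock.
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay


def query(host, port=123, timeout=5.0):
    """Query one NTP server over whichever family getaddrinfo returns first."""
    family, socktype, proto, _canon, addr = socket.getaddrinfo(
        host, port, socket.AF_UNSPEC, socket.SOCK_DGRAM)[0]
    with socket.socket(family, socktype, proto) as sock:
        sock.settimeout(timeout)
        t1 = time.time()
        sock.sendto(build_request(), addr)
        data, _ = sock.recvfrom(512)
        t4 = time.time()
    # Receive timestamp is at bytes 32..39, transmit timestamp at 40..47.
    rx_s, rx_f, tx_s, tx_f = struct.unpack("!4I", data[32:48])
    return offset_and_delay(
        t1, ntp_to_unix(rx_s, rx_f), ntp_to_unix(tx_s, tx_f), t4)
```

Note that the corrected time is the local time plus the offset alone; the delay is only a quality indicator for the measurement.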

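A Small Time Synchronization Tool for Windows

Because I am on CERNET, I often have to go through a proxy to reach the Internet and cannot connect directly to NTP servers, so I cannot use the Windows Time service to synchronize the clock. With the occasional computer repair or accidental wrong time setting, plus the drift of the computer's own clock, keeping the time right became a nuisance.

Setting the time by hand has no reference to go by, so the error is often large. There are some NTP servers on the IPv6 network, and on Linux ntpdate supports IPv6 NTP servers. But after a long search, all I found was a comment on one article saying that only one NTP client on Windows supports IPv6, and that it is commercial software; the commenter did not even give its name.

With no better option, I remembered that Python's httplib supports IPv6 connections, so, modeled on htpdate, I wrote htpdate.py, a small Python tool that synchronizes the clock against Google's IPv6 web server. Its error is considerably larger than NTP's but still acceptable (under 1 second), and it is convenient, since it updates the date as well. The code is below; it is rather rough.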

#!/usr/bin/env python
import httplib, time
from os import system

def main():
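  conn = httplib.HTTPConnection('google.com')
  time.clock()                  # on Windows, the first call starts the timer
  conn.request('HEAD', '')      # connects and sends the request
  t_rtt = time.clock()          # elapsed so far, used as the round-trip estimate
  res_time = conn.getresponse().getheader('date')
  # The Date header is in GMT; convert it to local time.
  t = time.localtime(time.mktime(time.strptime(res_time,
                                 '%a, %d %b %Y %H:%M:%S %Z')) - time.timezone)
  time_str = time.strftime('%H:%M:%S', t)
  local_time = time.asctime()   # local clock before the adjustment
  t_exe = time.clock()
  # The Date header has 1-second resolution, so the hundredths are taken from
  # the locally elapsed time minus half the round-trip estimate.
  centi_sec = (t_exe - t_rtt/2)*100
  if centi_sec > 99:
    centi_sec = 99
  # %02d zero-pads the hundredths; %2.0f would pad with a space, e.g. '. 5'
  system('time %s.%02d' % (time_str, centi_sec))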
  date_str = time.strftime('%Y-%m-%d', t)
  system('date %s' % date_str)
  print 'LOCAL  TIME: ' + local_time
  print 'SERVER TIME: ' + time.asctime(t)
  print 'LOCAL  TIME: ' + time.asctime()
  if (t_exe - t_rtt/2) >= 1:
    print 'Round trip time is too long. Time error might be larger than 1 sec.'

if __name__ == '__main__':
  main()
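
The time arithmetic in the script can be isolated into a small pure function. The following is a Python 3 sketch under assumptions, not a port of the script: the `estimate_server_now` name and its parameters are hypothetical, the network path is assumed to be symmetric, and the three clock readings are assumed to come from a monotonic timer such as time.perf_counter().

```python
from datetime import timedelta
from email.utils import parsedate_to_datetime


def estimate_server_now(date_header, t_start, t_response, t_now):
    """Estimate the server's current time from an HTTP Date header.

    t_start, t_response and t_now are readings of a monotonic clock taken
    just before sending the request, just after the response arrived, and
    at the moment the estimate is needed.
    """
    stamped = parsedate_to_datetime(date_header)   # timezone-aware for 'GMT'
    one_way = (t_response - t_start) / 2           # assume symmetric paths
    # Time that has passed locally since the server stamped the header.
    elapsed_since_stamp = (t_now - t_start) - one_way
    return stamped + timedelta(seconds=elapsed_since_stamp)
```

In Python 3 the header itself would come from http.client (the renamed httplib), for example via conn.getresponse().getheader('Date'). Since the Date header only has 1-second resolution, the result is still only good to within about a second.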