Google App API Protocol - Voice Search

Google 在移动平台(Android 和 iOS)上提供了独立的 Search App,但它不仅仅是用一个移动浏览器封装了 Google Web Search,而是做了很多移动应用相关的改进。这个系列文章,通过抓包对 Android Google App 与 Server 间通讯协议进行简单分析,管中窥豹,以见一斑。

  1. Google App API Protocol - Text Search
  2. Google App API Protocol - Voice Search
  3. Google App API Protocol - Search History

语音搜索

语音搜索与文本搜索不同,必须先进行语音识别(语音转文字),拿到识别结果后才能进行搜索,拿到搜索结果。语音识别过程是一个录音数据流上传过程,一般采取的都是流式分片上传。Google App 的语音搜索,仍然是沿袭 Google 一贯注重效率的风格,使用两个 HTTPS 请求,完成了录音数据的流式上传,以及识别结果、搜索结果和语音播报的流式下发。具体的做法是:

在用户发起语音请求时,Google App 与 Google 服务器同时建立两个 HTTPS POST 连接,以 URL 参数 "pair" 标识这是同一用户的一对连接:

https://www.google.com/m/voice-search/up?pair=4fc4d987-3f06-49d5-9f3a-986937a5e5fe
https://www.google.com/m/voice-search/down?pair=4fc4d987-3f06-49d5-9f3a-986937a5e5fe

Google App 基于 chunked 编码,在 up 连接中实现录音数据流式上传,在 down 连接中实现识别结果、搜索结果和语音播报的流式下发。

语音搜索 up 连接

在 up 连接中,Google App 仅利用了 POST 请求通路的 chunked 编码流式上传录音数据,直到所有数据上传完成,Google 服务器才会返回一个 Content-Length 为 0 的 200 响应消息。而且 POST 请求的 HTTP Header/Params 很简单,仅仅包含基本的信息。

Host	www.google.com
Connection	keep-alive
Transfer-Encoding	chunked
Cache-Control	no-cache, no-store
X-S3-Send-Compressible	1
User-Agent	Mozilla/5.0 (Linux; Android 4.4.4; HM 1S Build/KTU84P)
                AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/33.0.0.0 
                Mobile Safari/537.36 GSA/5.9.33.19.arm
Content-Type	application/octet-stream
Accept-Encoding	gzip, deflate

这里要注意 Cotent-Type 是 application/octet-stream,这代表上传的是二进制数据流,需要用专门的协议才能 decode 到真正内容。这给解析协议带来了很大的困难。简单观察二进制数据流,发现部分内容是遵从 protobuf 协议格式,但直接使用 application/x-protobuffer 的 Dilimited List 封包格式进行 docode 并不成功。无奈之下,只好采取比较笨的方法,观察二进制数据流,跳过那些明显不符合 protobuf 协议格式的部分,尽最大努力把 protobuf 消息先解包出来。然后再根据已经解包出来的部分,逐步推测未解析部分的编码格式。经过较长时间的分析,拿到了 Google App 语音搜索 up 连接的请求消息封包格式大致如下:

(10 bytes Header)
(Big Endian Fixed32 of Msg Len)(Msg)
(Big Endian Fixed32 of Msg Len)(Msg)
......
(Big Endian Fixed32 of Msg Len)(Msg)

对于 Header 长度是否恒为 10 字节,我持怀疑态度,有待通过更多分析发现规律。做了很多二进制码的分析工作,到最后才发现消息结构如此简单,让人不得不抱怨:Google 你直接用和文本搜索一样的 application/x-protobuffer 格式不得了吗?还搞得那么复杂!不过也许有一定历史原因吧。

和文本搜索一样,先建立一个 Empty protobuf Message,逐步去推导流式协议里 Message 的结构,尽最大猜测推导出来的 Google 语音搜索请求消息的 .proto 可能是这样的:

$ more voicesearch.proto
package com.google.search.app;

message VoiceSearchRequest {
    optional int32 fin_stream = 3;
    optional UserInfo user_info = 293000;
    optional VoiceSampling voice_sampling = 293100;
    optional VoiceData voice_data = 293101;
    optional ClientInfo client_info = 294000;
    optional UserPreference user_preference = 294500;
    optional Empty VSR27301014 = 27301014;
    optional Empty VSR27423252 = 27423252;
    optional Double VSR27801516 = 27801516;
    optional GetMethod get_method = 34352150;
    optional Empty VSR61914024 = 61914024;
    optional Empty VSR77499489 = 77499489;
    optional Empty VSR82185720 = 82185720;
}

message UserInfo {
    optional Empty lang = 2;
    optional Empty locale = 3;
    optional string uid = 5;
    optional GoogleNow google_now = 9;
}

message GoogleNow {
	optional string auth_url = 1;
	optional string auth_key = 2;
}

message VoiceSampling {
    optional float sample_rate = 2;
}

message VoiceData {
    optional bytes amr_stream = 1;
}

message ClientInfo {
	optional string type = 2;
	optional string user_agent = 4;
	repeated string expids = 5;
	optional string os = 8;
	optional string model = 9;
	optional string brand = 11;
}

message UserPreference {
    optional Favorites favorites = 1;
    optional Empty UP4 = 4;
    optional Empty UP25 = 25;
}

message Favorites {
	optional string lang = 9;
	repeated Empty favorites_data = 22;
}

message GetMethod {
	optional Empty params = 1;
	optional HttpHeader headers = 2;
	optional string path = 3;
}

message HttpHeader {
    repeated string name = 1;
    repeated string text_value = 2;
    repeated bytes binary_value = 4;
}

基于这个 .proto,对语音输入『北京天气』的 POST 请求消息进行解码,最终的结果摘要如下:

$ more up.txt
######## Data block info: offset=0xa blockSize=3717
#### Proto: Voicesearch.VoiceSearchRequest Message with Size=3654
user_info {
  lang {
    1: "cmn-Hans-CN"
    2: 1
  }
  locale {
    1: "zh_CN"
    2: 2
  }
  uid: "******"
  google_now {
    auth_url: "https://www.googleapis.com/auth/googlenow"
    auth_key: "******"
    7: 1
  }
  8: "w "
}
voice_sampling {
  sample_rate: 16000.0
  3: 9
}
client_info {
  type: "voice-search"
  user_agent: "Mozilla/5.0 (Linux; Android 4.4.4; ......"
  expids: "p2016_01_27_20_58_17"
  expids: "8501679"
  expids: "8501680"
  expids: "8502094"
  expids: "8502157"
  expids: "8502159"
  expids: "8502312"
  expids: "8502347"
  expids: "8502369"
  expids: "8502376"
  expids: "8502490"
  expids: "8502491"
  expids: "8502618"
  expids: "8502679"
  expids: "8502705"
  expids: "8502947"
  expids: "8502986"
  expids: "8503012"
  expids: "8503037"
  expids: "8503109"
  expids: "8503110"
  expids: "8503132"
  expids: "8503133"
  expids: "8503157"
  expids: "8503158"
  expids: "8503208"
  expids: "8503212"
  expids: "8503214"
  expids: "8503303"
  expids: "8503306"
  expids: "8503367"
  expids: "8503368"
  expids: "8503404"
  expids: "8503512"
  expids: "8503559"
  expids: "8503585"
  expids: "8503604"
  expids: "8503606"
  expids: "8503729"
  expids: "8503730"
  expids: "8503751"
  expids: "8503752"
  expids: "8503755"
  expids: "8503805"
  expids: "8503815"
  expids: "8503832"
  expids: "8503835"
  expids: "8503844"
  expids: "8503853"
  expids: "8503855"
  expids: "8503907"
  expids: "8503925"
  expids: "8503927"
  expids: "8503994"
  expids: "8504022"
  expids: "8504059"
  os: "Android"
  model: "KTU84P"
  brand: "HM 1S"
  1: ""
  10: "300601416"
  12: 720
  13: 1280
  14: 320
  17: "assistant-query-entry"
}
user_preference {
  favorites {
    favorites_data {
      1: 2
      2: "fresh"
      9: "cmn-Hans-CN"
    }
    favorites_data {
      1: 2
      2: "favorite-phone"
      9: "cmn-Hans-CN"
    }
    favorites_data {
      1: 2
      2: "favorite-email"
      9: "cmn-Hans-CN"
    }
    favorites_data {
      1: 2
      2: "favorite-person"
      9: "cmn-Hans-CN"
    }
  }
  UP4 {
    1: 10
    2: 250
    3: 1
  }
  UP25 {
    1: 0
  }
  3: 5
  5: 1
  7: 0
  13: 1
  14: 1
  20: 1
  22: 0
}
VSR27301014 {
  1: 460
  2: 0
  3: 460
  4: 0
  5: 1
}
VSR27423252 {
  1: "rbshUd2M8t4"
}
VSR27801516 {
  d7: 0.6037593483924866
}
get_method {
  params {
    1: "noj"
    1: "tch"
    1: "spknlang"
    1: "ar"
    1: "br"
    1: "ttsm"
    1: "client"
    1: "hl"
    1: "oe"
    1: "safe"
    1: "gcc"
    1: "ctzn"
    1: "ctf"
    1: "v"
    1: "padt"
    1: "padb"
    1: "ntyp"
    1: "ram_mb"
    1: "qsubts"
    1: "wf"
    1: "inm"
    1: "source"
    1: "entrypoint"
    1: "action"
    2: "1"
    2: "6"
    2: "cmn-Hans-CN"
    2: "0"
    2: "0"
    2: "default"
    2: "ms-android-xiaomi"
    2: "zh-CN"
    2: "utf-8"
    2: "images"
    2: "cn"
    2: "Asia/Shanghai"
    2: "1"
    2: "5.9.33.19.arm"
    2: "200"
    2: "640"
    2: "1"
    2: "870"
    2: "1458289692542"
    2: "pp1"
    2: "vs-asst"
    2: "and/assist"
    2: "android-assistant-query-entry"
    2: "devloc"
  }
  headers {
    name: "User-Agent"
    name: "X-Speech-RequestId"
    name: "Cookie"
    name: "Host"
    name: "Date"
    name: "X-Client-Instance-Id"
    name: "X-Geo"
    name: "X-Client-Opt-In-Context"
    name: "X-Client-Data"
    text_value: "Mozilla/5.0 (Linux; Android 4.4.4; ......"
    text_value: "rbshUd2M8t4"
    text_value: "******"
    text_value: "www.google.com.hk"
    text_value: "Fri, 18 Mar 2016 08:28:13 GMT"
    text_value: "c6aef75ce70631e9518ae8e7011bb3d7957b24d59b62fd1b9d378262b3690cff"
    text_value: "w CAEQDKIBBTk6MTox"
    binary_value: "\037\213\b\000......"
    binary_value: "\b\257\363\206......gsa"
  }
  path: "/search"
  5: 0
}
VSR61914024 {
  1: 1
}
VSR77499489 {
  1: 1
}
VSR82185720 {
  1: "\b\001"
}
1: "voicesearch-web"
2: 1
4: 0

######## Data block info: offset=0xe93 blockSize=39
#### Proto: Voicesearch.VoiceSearchRequest Message with Size=39
get_method {
  headers {
    name: "X-Geo"
    text_value: "w CAEQDKIBBTE6MTox"
    3: 2
  }
}
2: 1

######## Data block info: offset=0xebe blockSize=309
#### Proto: Voicesearch.VoiceSearchRequest Message with Size=309
voice_data {
  amr_stream: "#!AMR-WB......"
}
2: 1

######## Data block info: offset=0xff7 blockSize=309
#### Proto: Voicesearch.VoiceSearchRequest Message with Size=309
voice_data {
  amr_stream: "......"
}
2: 1

......

######## Data block info: offset=0x4394 blockSize=309
#### Proto: Voicesearch.VoiceSearchRequest Message with Size=309
voice_data {
  amr_stream: "......"
}
2: 1

######## Data block info: offset=0x44cd blockSize=267
#### Proto: Voicesearch.VoiceSearchRequest Message with Size=267
voice_data {
  amr_stream: "......"
}
2: 1

######## Data block info: offset=0x45dc blockSize=4
#### Proto: Voicesearch.VoiceSearchRequest Message with Size=4
fin_stream: 1
2: 1

语音搜索 down 连接

在 down 连接中,Google App 仅利用了 POST 响应通路的 chunked 编码流式下发识别结果、搜索结果和语音播报数据。App 发起的虽然是一个 POST 请求,但在请求中并没有任何 POST Data,Content-Length 为 0。响应消息的 header 如下:

Content-Type	application/vnd.google.octet-stream-compressible
Content-Disposition	attachment
Cache-Control	no-transform
X-Content-Type-Options	nosniff
Pragma	no-cache
Content-Encoding	gzip
Date	Fri, 18 Mar 2016 08:28:14 GMT
Server	S3 v1.0
X-XSS-Protection	1; mode=block
X-Frame-Options	SAMEORIGIN
Alternate-Protocol	443:quic,p=1
Alt-Svc	quic=":443"; ma=2592000; v="31,30,29,28,27,26,25"
Transfer-Encoding	chunked

这里值得注意的,仍然是 Cotent-Type,这次是 application/vnd.google.octet-stream-compressible,又一个二进制私有协议数据流。通过与 up 流分析类似的方法,发现 Google App 语音搜索 down 连接的响应消息封包格式大致如下:

(4 bytes Header)
(Big Endian Fixed32 of Msg Len)(Msg)
(Big Endian Fixed32 of Msg Len)(Msg)
......
(Big Endian Fixed32 of Msg Len)(Msg)

可以看到,除了 Header 长度不同以外,基本与 up 连接的请求消息封包格式一样。采取同样的方式推测到 Google 语音搜索响应消息的 .proto 可能是这样的:

$ more voicesearch.proto
package com.google.search.app;
message VoiceSearchResponse {
    optional int32 fin_stream = 1;
    optional RecogBlock recog_block = 1253625;
    optional SearchResult search_result = 39442181;
    optional TtsSound tts_sound = 28599812;
}

message RecogBlock {
    optional RecogResult recog_result = 1;
    optional VoiceRecording voice_recording = 2;
    optional string inputLang = 3;
    optional string searchLang = 4;
}

message RecogResult {
    optional CandidateResults can = 3;
    repeated RecogSegment recog_segment = 4;
    optional CandidateResults embededi15 = 5;
}

message VoiceRecording {
    optional int32 record_interval = 3;
}

message RecogSegment {
    optional DisplayResult display = 1;
    optional int32 seg_time = 3;
}

message DisplayResult {
    optional string query = 1;
    optional double prob = 2;
}

message CandidateResults {
    optional CandidateResult can3 = 3;
    optional Empty can4 = 4;
}

message CandidateResult {
    repeated string query = 1; 
    repeated string queryWords = 12; 
    optional float res2 = 2;
    repeated CandidateInfo res7 = 7;
}

message CandidateInfo {
    optional CandidateInfoMore inf1 = 1;
}

message CandidateInfoMore {
    optional CandidateInfoMoreDetial more1 = 1;
    optional string more3 = 3;
}

message CandidateInfoMoreDetial {
    optional string type = 1;
    optional string detial2 = 2;
    optional float detial3 = 3;
    optional CandidateInfoMoreDetialSnip detial7 = 7;
    optional string detial8 = 8;
}

message CandidateInfoMoreDetialSnip {
    optional string query = 1;
}

message SearchResult {
    optional string header = 1;
    optional bytes bodyBytes = 3;
}

message TtsSound {
    optional bytes sound_data = 1;
    optional int32 fin_stream = 2;
    optional Empty code_rate = 3;
}

message Empty {
}

基于这个 .proto,对语音输入『北京天气』的 down 响应消息进行解码,最终的结果摘要如下:

$ more down.txt
######## Raw Data block info: offset=0x4 size=41
#### Proto: Voicesearch.VoiceSearchResponse Message with Size=41
recog_block {
  voice_recording {
    record_interval: 140000
    1: 0
    2: 0
  }
  inputLang: "cmn-hans-cn"
  searchLang: "cmn-Hans-CN"
}

######## Raw Data block info: offset=0x31 size=61
#### Proto: Voicesearch.VoiceSearchResponse Message with Size=61
recog_block {
  recog_result {
    recog_segment {
      display {
        query: "北"
        prob: 0.01
      }
      seg_time: 1500000
      2: 0
    }
    1: 0
    2: 0
  }
  inputLang: "cmn-hans-cn"
  searchLang: "cmn-Hans-CN"
}

######## Raw Data block info: offset=0x72 size=64
#### Proto: Voicesearch.VoiceSearchResponse Message with Size=64
recog_block {
  recog_result {
    recog_segment {
      display {
        query: "北京"
        prob: 0.01
      }
      seg_time: 1560000
      2: 0
    }
    1: 0
    2: 0
  }
  inputLang: "cmn-hans-cn"
  searchLang: "cmn-Hans-CN"
}

######## Raw Data block info: offset=0xb6 size=67
#### Proto: Voicesearch.VoiceSearchResponse Message with Size=67
recog_block {
  recog_result {
    recog_segment {
      display {
        query: "北京天"
        prob: 0.01
      }
      seg_time: 1980000
      2: 0
    }
    1: 0
    2: 0
  }
  inputLang: "cmn-hans-cn"
  searchLang: "cmn-Hans-CN"
}

######## Raw Data block info: offset=0xfd size=71
#### Proto: Voicesearch.VoiceSearchResponse Message with Size=71
recog_block {
  recog_result {
    recog_segment {
      display {
        query: "北京天气"
        prob: 0.01
      }
      seg_time: 2100000
      2: 0
    }
    1: 0
    2: 0
  }
  inputLang: "cmn-hans-cn"
  searchLang: "cmn-Hans-CN"
}

######## Raw Data block info: offset=0x148 size=71
#### Proto: Voicesearch.VoiceSearchResponse Message with Size=71
recog_block {
  recog_result {
    recog_segment {
      display {
        query: "北京天气"
        prob: 0.9
      }
      seg_time: 2700000
      2: 0
    }
    1: 0
    2: 0
  }
  inputLang: "cmn-hans-cn"
  searchLang: "cmn-Hans-CN"
}

######## Raw Data block info: offset=0x193 size=14
#### Proto: Voicesearch.VoiceSearchResponse Message with Size=14
recog_block {
  voice_recording {
    1: 1
    2: 2100000
  }
}

######## Raw Data block info: offset=0x1a5 size=40
#### Proto: Voicesearch.VoiceSearchResponse Message with Size=40
recog_block {
  voice_recording {
    1: 2
    2: 3570000
  }
  inputLang: "cmn-hans-cn"
  searchLang: "cmn-Hans-CN"
}

######## Raw Data block info: offset=0x1d1 size=491
#### Proto: Voicesearch.VoiceSearchResponse Message with Size=484
recog_block {
  recog_result {
    can {
      can3 {
        query: "北京天气"
        query: "北京天気"
        query: "北京天器"
        res2: 0.9156357
        queryWords: "北 京 天 气"
        queryWords: "北 京 天 気"
        queryWords: "北 京 天 器"
        6: "\020\n\030\001"
      }
      1: 0
      2: 3570000
    }
    embededi15 {
      can3 {
        query: "北京天气"
        query: "北京天気"
        query: "北京天器"
        res2: 0.9156357
        res7 {
          inf1 {
            more1 {
              type: "literal"
              detial2: "北京天气"
              detial3: 1.0
              detial7 {
                query: "北京天气"
                2: 0
                3: 75
              }
              detial8: "北 京 天 气"
            }
            more3: ""
          }
        }
        res7 {
          inf1 {
            more1 {
              type: "literal"
              detial2: "北京天気"
              detial3: 1.0
              detial7 {
                query: "北京天気"
                2: 0
                3: 75
              }
              detial8: "北 京 天 気"
            }
            more3: ""
          }
        }
        res7 {
          inf1 {
            more1 {
              type: "literal"
              detial2: "北京天器"
              detial3: 1.0
              detial7 {
                query: "北京天器"
                2: 0
                3: 75
              }
              detial8: "北 京 天 器"
            }
            more3: ""
          }
        }
        queryWords: "北 京 天 气"
        queryWords: "北 京 天 気"
        queryWords: "北 京 天 器"
      }
      1: 0
      2: 3570000
    }
    1: 1
    2: 0
  }
  inputLang: "cmn-hans-cn"
  searchLang: "cmn-Hans-CN"
}

######## Raw Data block info: offset=0x3c0 size=29759
#### Proto: Voicesearch.VoiceSearchResponse Message with Size=29759
search_result {
  header: "HTTP/1.1 200 OK
  Content-Type: application/x-protobuffer
  Date: Fri, 18 Mar 2016 08:28:17 GMT
  Expires: -1
  Cache-Control: no-store
  Set-Cookie: ******
  Trailer: X-Google-GFE-Current-Request-Cost-From-GWS
  Set-Cookie: ******
  Content-Disposition: attachment; filename=\"f.txt\"
  
  "
  bodyBytes: "\301R\n\026IbzrVsTgFsK0jwP1iY34CQ .... charset=UTF-8H\001"
  2: 1
  4: 0
}
## Bodybytes Proto: Textsearch.SearchResponse with Size=10561
search_id: "IbzrVsTgFsK0jwP1iY34CQ"
msg_type: 97000
result {
  fin_stream: 0
  text_data: "北京天气"
  html_data: "<!doctype html><html ......</script>"
  encoding: "text/html; charset=UTF-8"
  9: 1
}
## Bodybytes Proto: Textsearch.SearchResponse with Size=16951
search_id: "IbzrVsTgFsK0jwP1iY34CQ"
result {
  fin_stream: 0
  html_data: "<style data-jiis=\"cc\" ......</script>"
  encoding: "text/html; charset=UTF-8"
  9: 1
}
## Bodybytes Proto: Textsearch.SearchResponse with Size=1511
......

######## Raw Data block info: offset=0x7803 size=81
#### Proto: Voicesearch.VoiceSearchResponse Message with Size=81
search_result {
  bodyBytes: "D\n\026IbzrVsTgFsK0jwP1iY34CQ\......"
  2: 1
  4: 0
}
## Bodybytes Proto: Textsearch.SearchResponse with Size=68
......

######## Raw Data block info: offset=0x7858 size=107685
#### Proto: Voicesearch.VoiceSearchResponse Message with Size=107685
......

## Bodybytes Proto: Textsearch.SearchResponse with Size=103014
......
## Bodybytes Proto: Textsearch.SearchResponse with Size=2454
......
## Bodybytes Proto: Textsearch.SearchResponse with Size=2194
......

######## Raw Data block info: offset=0x21d01 size=8480
#### Proto: Voicesearch.VoiceSearchResponse Message with Size=8480
......
## Bodybytes Proto: Textsearch.SearchResponse with Size=7260
......
## Bodybytes Proto: Textsearch.SearchResponse with Size=1202
......

######## Raw Data block info: offset=0x23e25 size=1615
#### Proto: Voicesearch.VoiceSearchResponse Message with Size=1615
tts_sound {
  sound_data: "\377\363@\304\......"
  code_rate {
    1: 22050
  }
}

######## Raw Data block info: offset=0x24478 size=1609
#### Proto: Voicesearch.VoiceSearchResponse Message with Size=1609
tts_sound {
  sound_data: "k\224ed4= \213......"
}

######## Raw Data block info: offset=0x24ac5 size=1609
#### Proto: Voicesearch.VoiceSearchResponse Message with Size=1609
......

######## Raw Data block info: offset=0x25112 size=1609
#### Proto: Voicesearch.VoiceSearchResponse Message with Size=1609
......

######## Raw Data block info: offset=0x2575f size=1609
#### Proto: Voicesearch.VoiceSearchResponse Message with Size=1609
......

######## Raw Data block info: offset=0x25dac size=1518
#### Proto: Voicesearch.VoiceSearchResponse Message with Size=1518
......

######## Raw Data block info: offset=0x2639e size=7
#### Proto: Voicesearch.VoiceSearchResponse Message with Size=7
tts_sound {
  fin_stream: 1
}

######## Raw Data block info: offset=0x263a9 size=2
#### Proto: Voicesearch.VoiceSearchResponse Message with Size=2
fin_stream: 1

协议分析和启发

针对以上解码结果,对 Google App 语音搜索协议分析如下:

双向流式通信模式

上传音频采用流式方式,这个并不难想,但是识别结果、搜索结果和语音播报音频流全部在一个流里回传,需要做很多复杂的架构升级工作,带来的收益也是很明显的:

  1. 识别结果实时上屏。语音识别在解码时,需要收集到足够的语音流之后,才能识别出来文字,也就是说文字结果的出现时间比较随机。流式的识别结果下发能实现一旦有识别中间结果,就可以用最快的速度下发到 App 端展示给用户。
  2. 减少一次搜索请求开销。从逻辑上讲,应该先进行语音识别,再用识别出的 Query 发起搜索。但 Google 把这一步放在了服务端,不需要用户再发起一次搜索结果的 GET 请求。因为 Google Search 的域名和 locale 有关,新的 GET 请求可能需要发给 google.com.hk,这就需要 App 新建一个 HTTPS 连接,开销还是比较大的。而且服务器端还可以进一步优化,比如在识别出中间结果的同时请求即时搜索,不知道 Google 有没有做。
  3. 减少一次语音播报请求开销。原理同上
  4. 提升了 TCP 信道利用率。在同等传输数据量下,减少网络连接数,受 TCP 拥塞控制策略影响,TCP 信道的传输性能能得到一定提升。

录音小块回传

从 up 解码结果来看,Google App 音频流的是以固定每 300 字节一个封包,基于 AMR-WB 压缩这基本相当于 100ms 左右的录音。说明录音数据的上传还是很频繁的,这也能够让服务端尽早地识别出中间结果。还有就是,300字节的设计比较容易 fit in 1500 左右的以太网 MTU、576 的 3G、4G MTU。

搜索结果内置

搜索结果内置,其实相当于在服务器端『代理』App 发起一次内网搜索请求,那服务器端必须要知道正常的 App 请求参数是怎样的。所以在 App 端发起语音请求时,首先向服务器端传送了一些 App 端的配置信息,比如语言、地域、终端类型、用户偏好、搜索参数、搜索 Header 等。有了这些信息,服务器端就可以伪装成 App,通过内网发起搜索请求,获取搜索结果然后把结果内置到语音搜索结果里。但注意到中文语音搜索和文本搜索的域名不同,分别是 google.com 和 google.com.hk,在机房部署上如何高效地实现全球机房的高效内网访问,这仍然是个架构难题,这里无法窥知答案。

对文本搜索结果的分析我们了解到,Google App 的搜索结果是用 Protobuf Message 数组的方式下传的。但语音搜索并没有使用与文本搜索相同的数组元素 Message,所以文本搜索结果是以 bytes 方式,将序列化后的文本搜索 Message 放到语音搜索 Message 中的一个字段中。这也许是为了协议解耦,避免单方面协议改动带来的同步问题。但可能是基于效率考虑,多个文本搜索 Message 可能会被放到同一个语音搜索 Message 中,也是以 Dilimited List 的方式序列化后放入。语音搜索模块解码出来 bytes 之后,可以直接塞给文本搜索的渲染模块渲染,与文本搜索接收的协议格式完全相同。

语音播报

语音播报的音频也是以流式下传,这样可以边播边收,也有效率优势。

发表回复

您的邮箱地址不会被公开。 必填项已用 * 标注