Wednesday, February 24, 2016

OK Google! - Products and Patents

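In my own experience, Google's speech recognition is better than Apple's: when I dictate on my phone, I can often send the text as soon as I finish speaking, with no corrections needed. As for the voice assistant, my routine is to press and hold the call button on the headset, and the phone then prompts for voice input... and the result... depends on the phone.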





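The Android counterpart to Siri (Google may not like her being called that) is covered by US patents such as US 8768712 (application no. 14/096,359; filing date: 12-04-2013). Follow-on filings derived from this case include (source: PAIR):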

14/220,781 (filing date: 03-20-2014)
14/990,462 (filing date: 01-07-2016)
14/991,092 (filing date: 01-08-2016)
PCT/US14/31475 (PCT application; filing date: 03-21-2014)

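The PCT application shows that Google intends to seek protection in more countries. Counting from the earliest filing date of 12-04-2013, the PCT deadline (30 months, for national-phase entry) falls on 06-04-2016.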
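The date arithmetic above can be checked with a short sketch. It assumes the deadline in question is the usual 30-month PCT national-phase time limit counted from the earliest filing date; `add_months` is a hypothetical helper written for this illustration, not part of any patent-office tool.

```python
import calendar
from datetime import date


def add_months(start: date, months: int) -> date:
    """Return the date `months` calendar months after `start`.

    The day is clamped to the length of the target month
    (for example, Aug 31 + 6 months gives the last day of February).
    """
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


# Earliest filing date from the post (12-04-2013, written MM-DD-YYYY).
earliest_filing = date(2013, 12, 4)

# Assumption: a 30-month limit counted from the earliest filing date.
deadline = add_months(earliest_filing, 30)
print(deadline.isoformat())  # 2016-06-04
```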

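US 8768712 discloses a technique for initiating actions based on a partial hotword. As Claim 1 sets out, the method receives audio data and determines that its initial portion corresponds to the initial portion of a hotword. In other words, it looks for the key part (or feature) in the audio data, uses that match to select a subset of the actions that would be performed on detecting the entire hotword, and causes those actions to be performed.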

1. A computer-implemented method comprising:
receiving audio data;
determining that an initial portion of the audio data corresponds to an initial portion of a hotword;
in response to determining that the initial portion of the audio data corresponds to the initial portion of the hotword, selecting, by one or more computers, from among a set of one or more actions that are performed when the entire hotword is detected, a subset of the one or more actions; and
causing one or more actions of the subset to be performed.
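The flow of Claim 1 can be sketched roughly as follows. This is only an illustration under loud assumptions: real systems match acoustic features rather than text, so the already-transcribed prefix here stands in for "an initial portion of the audio data", and the action names and the partial-hotword table are hypothetical, not taken from the patent.

```python
from typing import Callable, Dict, List

HOTWORD = "ok google"

# Hypothetical set of actions performed when the entire hotword is detected.
FULL_HOTWORD_ACTIONS: Dict[str, Callable[[], str]] = {
    "wake_display": lambda: "display on",
    "load_speech_recognizer": lambda: "recognizer loaded",
    "open_query_prompt": lambda: "prompt shown",
}

# Hypothetical mapping from a partial hotword to the subset of actions
# that may already start before the rest of the hotword is heard.
PARTIAL_HOTWORD_ACTIONS: Dict[str, List[str]] = {
    "ok": ["wake_display", "load_speech_recognizer"],
}


def select_actions(heard_prefix: str) -> List[str]:
    """Select action names for the portion of the utterance heard so far."""
    heard = heard_prefix.strip().lower()
    if heard == HOTWORD:
        # Entire hotword detected: the full action set applies.
        return list(FULL_HOTWORD_ACTIONS)
    if HOTWORD.startswith(heard) and heard in PARTIAL_HOTWORD_ACTIONS:
        # Initial portion matches: only a subset of the full set is selected.
        return PARTIAL_HOTWORD_ACTIONS[heard]
    return []


def handle_audio(heard_prefix: str) -> List[str]:
    """Cause the selected actions to be performed and collect their results."""
    return [FULL_HOTWORD_ACTIONS[name]() for name in select_actions(heard_prefix)]


print(handle_audio("OK"))     # ['display on', 'recognizer loaded']
print(handle_audio("hello"))  # []
```

Starting part of the work on the partial hotword is what lets the device appear more responsive once the full hotword has been spoken.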
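The claim set follows the layout common to software patents, with a system claim and a computer-readable storage device claim mirroring the method claim: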
18. A system comprising: 
one or more computers and one or more storage devices storing instructions that are operable, when executed by the one or more computers, to cause the one or more computers to perform operations comprising: 
receiving audio data; 
determining that an initial portion of the audio data corresponds to an initial portion of a hotword; 
in response to determining that the initial portion of the audio data corresponds to the initial portion of the hotword, selecting, from among a set of one or more actions that are performed when the entire hotword is detected, a subset of the one or more actions; and 
causing one or more actions of the subset to be performed. 

19. A computer-readable storage device storing software comprising instructions executable by one or more computers which, upon such execution, cause the one or more computers to perform operations comprising: 
receiving audio data; 
determining that an initial portion of the audio data corresponds to an initial portion of a hotword; 
in response to determining that the initial portion of the audio data corresponds to the initial portion of the hotword, selecting, from among a set of one or more actions that are performed when the entire hotword is detected, a subset of the one or more actions; and 
causing one or more actions of the subset to be performed. 

The specification vividly describes how this voice service is started:
"In some examples, the user 116 says one or more words that the mobile computing device 100 detects. In some examples, the utterance includes one or more hotwords, or partial hotwords, that cause an action to be performed by the mobile computing device 100. As depicted in the illustrated example, the user 116 says "OK Google." The mobile computing device 100 detects the utterance "OK Google" with the audio subsystem 102 appropriately receiving audio data of the utterance "OK Google." "





Another continuation filing (not yet allowed): 14/220,781 (filing date: 03-20-2014)




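Prior art:
The search report for the PCT application of Google's voice assistant patent cites a prior-art reference filed by Nokia in 2010: WO/2012/025784. Its method comprises converting an audio frequency domain signal into voltage signals, determining their characteristics, comparing those characteristics with the characteristics of an audio trigger command, and initiating the audio user interface based on the result of the comparison.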


Claims

1. A method comprising:
converting an audio frequency domain signal into one or more voltage signals;
determining the characteristics of the one or more voltage signals;
comparing the characteristics of the one or more voltage signals with one or more characteristics of an audio trigger command; and
initiating activation of an audio user interface on the basis of the comparison.
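The steps of this Nokia claim can be sketched in software, with heavy caveats. The claim's conversion to voltage signals is a hardware step, so a list of sampled values stands in for the voltage signals here. The two characteristics (mean energy and zero-crossing rate), the 20% tolerance, and all function names are assumptions chosen for illustration, not details taken from WO/2012/025784.

```python
from typing import Dict, Sequence


def signal_characteristics(samples: Sequence[float]) -> Dict[str, float]:
    """Determine simple characteristics of a sampled (voltage) signal."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    energy = sum(s * s for s in samples) / len(samples)
    # Count sign changes between consecutive samples.
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0)
    )
    return {
        "energy": energy,
        "zero_crossing_rate": crossings / (len(samples) - 1),
    }


def matches_trigger(
    observed: Dict[str, float],
    trigger: Dict[str, float],
    tolerance: float = 0.2,
) -> bool:
    """Compare observed characteristics with those of the trigger command."""
    return all(
        abs(observed[key] - trigger[key]) <= tolerance * abs(trigger[key])
        for key in trigger
    )


def should_activate_voice_ui(
    samples: Sequence[float], trigger: Dict[str, float]
) -> bool:
    """Decide, on the basis of the comparison, whether to start the voice UI."""
    return matches_trigger(signal_characteristics(samples), trigger)


# Hypothetical reference signal for the trigger command.
trigger = signal_characteristics([0.5, -0.5, 0.5, -0.5, 0.5, -0.5])

print(should_activate_voice_ui([0.52, -0.48, 0.5, -0.5, 0.49, -0.51], trigger))  # True
print(should_activate_voice_ui([0.0] * 6, trigger))  # False
```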

Ron
