SQL: MD5 in BigQuery

In BigQuery, I am using the MD5 function like this:

SELECT md5('<<some string>>') AS hashed

The result always ends with "==", for example:

R7zlx09Yn0hn29V+nKn4CA==

Why does "==" always appear at the end?


MD5 returns BYTES. If you need a string, use TO_HEX to get the desired representation:

TO_HEX: Converts a sequence of BYTES into a hexadecimal STRING.
Converts each byte in the STRING as two hexadecimal characters in the
range (0..9, a..f).

SELECT TO_HEX(md5('123456')) AS hashed

This returns:

e10adc3949ba59abbe56e057f20f883e
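
The same conversion can be checked outside BigQuery. Below is a minimal Python sketch, assuming only the standard `hashlib` module. `md5_hex` is a hypothetical helper name chosen for illustration, not a BigQuery function.

```python
import hashlib


def md5_hex(text):
    """Return the MD5 digest of text as lowercase hex, analogous to TO_HEX(MD5(text))."""
    raw = hashlib.md5(text.encode("utf-8")).digest()  # 16 raw bytes
    return raw.hex()  # two hex characters per byte, 32 in total


print(md5_hex("123456"))
```

For the input `'123456'` this should print the same 32-character hex string as the query above.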

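The = characters are base64 padding. According to the documentation, MD5 returns BYTES, but the value displayed in the results is the base64 encoding of those bytes. An MD5 digest is always 16 bytes, and base64 encodes bytes in groups of three, so one byte is always left over and the encoded string always ends in two = characters. You can check this with the following query: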

SELECT MD5("Hello World") AS MD5,TO_HEX(MD5("Hello World")) AS BYTES,TO_BASE64(FROM_HEX(TO_HEX(MD5("Hello World")))) AS BASE64

It produces the following output:

ROW |MD5                        |BYTES                              |BASE64  
1   |sQqNsWTgdUEFt6mb5y4/5Q==   |b10a8db164e0754105b7a99be72e3fe5   |sQqNsWTgdUEFt6mb5y4/5Q==
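
The padding can be reproduced outside BigQuery. Base64 turns every 3 bytes into 4 characters. A 16-byte MD5 digest is five full groups plus one leftover byte, and that last byte becomes 2 characters followed by 2 `=` signs. Below is a minimal Python sketch, assuming only the standard `hashlib` and `base64` modules. `md5_views` is a hypothetical helper name used for illustration.

```python
import base64
import hashlib


def md5_views(text):
    """Return (hex, base64) string views of the same 16-byte MD5 digest."""
    raw = hashlib.md5(text.encode("utf-8")).digest()
    return raw.hex(), base64.b64encode(raw).decode("ascii")


hex_digest, b64_digest = md5_views("Hello World")
print(hex_digest)
print(b64_digest)

# Both strings decode back to the same raw bytes.
print(base64.b64decode(b64_digest) == bytes.fromhex(hex_digest))
```

The hex and base64 strings are two renderings of the same bytes, so neither is more correct than the other. Hex is simply the form most tools expect for an MD5 checksum.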