Storing UUID as base64 String
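I have been experimenting with using UUIDs as database keys. I want to take up as few bytes as possible while still keeping the UUID representation human-readable.
I think I have gotten it down to 22 bytes by using base64 and removing the trailing "==", which seems unnecessary to store for my purposes. Are there any flaws with this approach?
Basically, my test code does a bunch of conversions to get the UUID down to a 22-byte string, then converts it back into a UUID.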
    import java.io.IOException;
    import java.util.UUID;

    public class UUIDTest {

        public static void main(String[] args) {
            UUID uuid = UUID.randomUUID();
            System.out.println("UUID String: " + uuid.toString());
            System.out.println("Number of Bytes: " + uuid.toString().getBytes().length);
            System.out.println();

            byte[] uuidArr = asByteArray(uuid);
            System.out.print("UUID Byte Array: ");
            for (byte b : uuidArr) {
                System.out.print(b + " ");
            }
            System.out.println();
            System.out.println("Number of Bytes: " + uuidArr.length);
            System.out.println();

            try {
                // Convert a byte array to a base64 string.
                // Note: sun.misc.BASE64Encoder/BASE64Decoder are internal JDK classes
                // and are no longer available in JDK 9 and later.
                String s = new sun.misc.BASE64Encoder().encode(uuidArr);
                System.out.println("UUID Base64 String: " + s);
                System.out.println("Number of Bytes: " + s.getBytes().length);
                System.out.println();

                // Drop the trailing "==" padding
                String trimmed = s.split("=")[0];
                System.out.println("UUID Base64 String Trimmed: " + trimmed);
                System.out.println("Number of Bytes: " + trimmed.getBytes().length);
                System.out.println();

                // Convert the base64 string back to a byte array
                byte[] backArr = new sun.misc.BASE64Decoder().decodeBuffer(trimmed);
                System.out.print("Back to UUID Byte Array: ");
                for (byte b : backArr) {
                    System.out.print(b + " ");
                }
                System.out.println();
                System.out.println("Number of Bytes: " + backArr.length);

                // Keep only the first 16 bytes; the decoder returned extra bytes
                byte[] fixedArr = new byte[16];
                for (int i = 0; i < 16; i++) {
                    fixedArr[i] = backArr[i];
                }
                System.out.println();
                System.out.print("Fixed UUID Byte Array: ");
                for (byte b : fixedArr) {
                    System.out.print(b + " ");
                }
                System.out.println();
                System.out.println("Number of Bytes: " + fixedArr.length);
                System.out.println();

                UUID newUUID = toUUID(fixedArr);
                System.out.println("UUID String: " + newUUID.toString());
                System.out.println("Number of Bytes: " + newUUID.toString().getBytes().length);
                System.out.println();

                System.out.println("Equal to Start UUID? " + newUUID.equals(uuid));
                if (!newUUID.equals(uuid)) {
                    System.exit(0);
                }
            } catch (IOException e) {
                e.printStackTrace();
            }
        }

        public static byte[] asByteArray(UUID uuid) {
            long msb = uuid.getMostSignificantBits();
            long lsb = uuid.getLeastSignificantBits();
            byte[] buffer = new byte[16];

            // Big-endian: most significant byte first
            for (int i = 0; i < 8; i++) {
                buffer[i] = (byte) (msb >>> 8 * (7 - i));
            }
            // Use (15 - i) so the shift count stays non-negative
            for (int i = 8; i < 16; i++) {
                buffer[i] = (byte) (lsb >>> 8 * (15 - i));
            }
            return buffer;
        }

        public static UUID toUUID(byte[] byteArray) {
            long msb = 0;
            long lsb = 0;
            for (int i = 0; i < 8; i++)
                msb = (msb << 8) | (byteArray[i] & 0xff);
            for (int i = 8; i < 16; i++)
                lsb = (lsb << 8) | (byteArray[i] & 0xff);
            return new UUID(msb, lsb);
        }
    }
Output:

    UUID String: cdaed56d-8712-414d-b346-01905d0026fe
    Number of Bytes: 36

    UUID Byte Array: -51 -82 -43 109 -121 18 65 77 -77 70 1 -112 93 0 38 -2
    Number of Bytes: 16

    UUID Base64 String: za7VbYcSQU2zRgGQXQAm/g==
    Number of Bytes: 24

    UUID Base64 String Trimmed: za7VbYcSQU2zRgGQXQAm/g
    Number of Bytes: 22

    Back to UUID Byte Array: -51 -82 -43 109 -121 18 65 77 -77 70 1 -112 93 0 38 -2 0 38
    Number of Bytes: 18

    Fixed UUID Byte Array: -51 -82 -43 109 -121 18 65 77 -77 70 1 -112 93 0 38 -2
    Number of Bytes: 16

    UUID String: cdaed56d-8712-414d-b346-01905d0026fe
    Number of Bytes: 36

    Equal to Start UUID? true
I was trying to do something similar as well. This is what I am using.
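Usage:

    // Reconstructed from the output and functions shown in this answer;
    // the variable names are placeholders.
    String uuidStr = "6fcb514b-b878-4c9d-95b7-8dc3a7ce6fd8";
    String uuidAs64 = uuidToBase64(uuidStr);
    System.out.println("as base64: " + uuidAs64);
    System.out.println("as uuid  : " + uuidFromBase64(uuidAs64));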
Output:

    as base64: b8tRS7h4TJ2Vt43Dp85v2A
    as uuid  : 6fcb514b-b878-4c9d-95b7-8dc3a7ce6fd8
Functions:

    import java.nio.ByteBuffer;
    import java.util.UUID;
    import org.apache.commons.codec.binary.Base64;

    // Place these methods in any utility class.

    private static String uuidToBase64(String str) {
        UUID uuid = UUID.fromString(str);
        ByteBuffer bb = ByteBuffer.wrap(new byte[16]);
        bb.putLong(uuid.getMostSignificantBits());
        bb.putLong(uuid.getLeastSignificantBits());
        // URL-safe alphabet ('-' and '_'), no padding
        return Base64.encodeBase64URLSafeString(bb.array());
    }

    private static String uuidFromBase64(String str) {
        byte[] bytes = Base64.decodeBase64(str);
        ByteBuffer bb = ByteBuffer.wrap(bytes);
        UUID uuid = new UUID(bb.getLong(), bb.getLong());
        return uuid.toString();
    }
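You can safely drop the "==" padding in this application. If you were going to decode the base-64 text back into bytes, most libraries would expect the padding to be there, but since you are only using the resulting string as a key, that is not a problem.
I like Base-64 because its limited character set looks less like gibberish, but there is also Base-85. It uses more characters and encodes 4 bytes as 5 characters, so you could get the text down to 20 characters.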
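To make the arithmetic behind those lengths explicit, here is a small sketch. The class and method names (`EncodedLength`, `unpaddedLength`, `base85Length`) are made up for illustration; it only computes lengths and does not perform any encoding.

```java
class EncodedLength {
    /** Characters needed for numBytes when each character carries bitsPerChar bits (no padding). */
    static int unpaddedLength(int numBytes, int bitsPerChar) {
        int bits = numBytes * 8;
        return (bits + bitsPerChar - 1) / bitsPerChar; // ceiling division
    }

    /** Base-85 maps each 4-byte group to 5 characters (assumes numBytes is a multiple of 4). */
    static int base85Length(int numBytes) {
        return numBytes / 4 * 5;
    }

    public static void main(String[] args) {
        // A UUID is 16 bytes (128 bits)
        System.out.println("hex, no dashes:   " + unpaddedLength(16, 4)); // 32
        System.out.println("base64, unpadded: " + unpaddedLength(16, 6)); // 22
        System.out.println("base85:           " + base85Length(16));      // 20
    }
}
```

So going from Base-64 to Base-85 saves two more characters per key, at the cost of a larger and less URL-friendly character set.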
Here is my code. It uses org.apache.commons.codec.binary.Base64 to produce URL-safe unique strings that are 22 characters long (and have the same uniqueness as a UUID).

    import java.nio.ByteBuffer;
    import java.nio.LongBuffer;
    import java.nio.charset.StandardCharsets;
    import java.util.UUID;
    import org.apache.commons.codec.binary.Base64;
    // Assumption: Commons Lang 3; older projects may use org.apache.commons.lang.StringUtils
    import org.apache.commons.lang3.StringUtils;

    public class KeyGenerator {

        // 'true' selects the URL-safe alphabet
        private static final Base64 BASE64 = new Base64(true);

        public static String generateKey() {
            UUID uuid = UUID.randomUUID();
            byte[] uuidArray = KeyGenerator.toByteArray(uuid);
            byte[] encodedArray = BASE64.encode(uuidArray);
            String returnValue = new String(encodedArray, StandardCharsets.US_ASCII);
            // This encoder appends a trailing line separator; strip it
            returnValue = StringUtils.removeEnd(returnValue, "\r\n");
            return returnValue;
        }

        public static UUID convertKey(String key) {
            UUID returnValue = null;
            if (StringUtils.isNotBlank(key)) {
                // Convert the base64 string to a byte array
                byte[] decodedArray = BASE64.decode(key);
                returnValue = KeyGenerator.fromByteArray(decodedArray);
            }
            return returnValue;
        }

        private static byte[] toByteArray(UUID uuid) {
            byte[] byteArray = new byte[(Long.SIZE / Byte.SIZE) * 2];
            ByteBuffer buffer = ByteBuffer.wrap(byteArray);
            LongBuffer longBuffer = buffer.asLongBuffer();
            longBuffer.put(new long[] { uuid.getMostSignificantBits(), uuid.getLeastSignificantBits() });
            return byteArray;
        }

        private static UUID fromByteArray(byte[] bytes) {
            ByteBuffer buffer = ByteBuffer.wrap(bytes);
            LongBuffer longBuffer = buffer.asLongBuffer();
            return new UUID(longBuffer.get(0), longBuffer.get(1));
        }
    }
I have an application where I do almost exactly this: a UUID encoded in 22 characters. It works fine. However, the main reason I do it this way is that the IDs are exposed in the web app's URIs, and 36 characters is really quite big for something that appears in a URI. 22 characters is still rather long, but we make do.
Here is the Ruby code for it:

    # Make an array of 64 URL-safe characters
    CHARS64 = ("a".."z").to_a + ("A".."Z").to_a + ("0".."9").to_a + ["-", "_"]

    # Return a 22 byte URL-safe string, encoded six bits at a time using 64 characters
    def to_s22
      integer = self.to_i  # UUID as a raw integer
      rval = ""
      22.times do
        c = (integer & 0x3F)
        rval += CHARS64[c]
        integer = integer >> 6
      end
      return rval.reverse
    end
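It is not exactly the same as base64 encoding, because base64 uses characters that would have to be escaped if they appeared in a URI path component. A Java implementation is likely to look quite different, since you are more likely to have an array of raw bytes than a really big integer.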
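For comparison, here is one way the same scheme could look in Java. This is only a sketch, not a port anyone in this thread has confirmed: the class name `UuidS22` and the `fromS22` decoder are my own additions, and it goes through `BigInteger` to mirror the Ruby integer arithmetic rather than working on the raw bytes directly.

```java
import java.math.BigInteger;
import java.nio.ByteBuffer;
import java.util.UUID;

class UuidS22 {
    // Same 64 URL-safe characters, in the same order, as the Ruby CHARS64 array
    private static final String CHARS64 =
        "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789-_";

    /** Encode the 128-bit UUID value six bits at a time into 22 characters. */
    static String toS22(UUID uuid) {
        byte[] bytes = ByteBuffer.allocate(16)
            .putLong(uuid.getMostSignificantBits())
            .putLong(uuid.getLeastSignificantBits())
            .array();
        BigInteger value = new BigInteger(1, bytes); // treat as unsigned
        char[] out = new char[22];
        // Fill from the right so no reverse step is needed
        for (int i = 21; i >= 0; i--) {
            out[i] = CHARS64.charAt(value.intValue() & 0x3F);
            value = value.shiftRight(6);
        }
        return new String(out);
    }

    /** Hypothetical inverse of toS22; not part of the Ruby original. */
    static UUID fromS22(String s) {
        if (s.length() != 22) {
            throw new IllegalArgumentException("expected 22 characters");
        }
        BigInteger value = BigInteger.ZERO;
        for (int i = 0; i < s.length(); i++) {
            int digit = CHARS64.indexOf(s.charAt(i));
            if (digit < 0) {
                throw new IllegalArgumentException("invalid character: " + s.charAt(i));
            }
            value = value.shiftLeft(6).or(BigInteger.valueOf(digit));
        }
        // longValue() keeps the low 64 bits, which is exactly what each half needs
        return new UUID(value.shiftRight(64).longValue(), value.longValue());
    }
}
```

Because 22 characters hold 132 bits, the first character only ever carries the top two bits of the UUID, so it is always one of a, b, c or d.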
Here is an example using java.util.Base64, which was introduced in JDK 8:

    import java.nio.ByteBuffer;
    import java.util.Base64;
    import java.util.Base64.Encoder;
    import java.util.UUID;

    public class Uuid64 {

        private static final Encoder BASE64_URL_ENCODER = Base64.getUrlEncoder().withoutPadding();

        public static void main(String[] args) {
            // String uuidStr = UUID.randomUUID().toString();
            String uuidStr = "eb55c9cc-1fc1-43da-9adb-d9c66bb259ad";
            String uuid64 = uuidHexToUuid64(uuidStr);
            System.out.println(uuid64);          //=> 61XJzB_BQ9qa29nGa7JZrQ
            System.out.println(uuid64.length()); //=> 22
            String uuidHex = uuid64ToUuidHex(uuid64);
            System.out.println(uuidHex);         //=> eb55c9cc-1fc1-43da-9adb-d9c66bb259ad
        }

        public static String uuidHexToUuid64(String uuidStr) {
            UUID uuid = UUID.fromString(uuidStr);
            byte[] bytes = uuidToBytes(uuid);
            return BASE64_URL_ENCODER.encodeToString(bytes);
        }

        public static String uuid64ToUuidHex(String uuid64) {
            byte[] decoded = Base64.getUrlDecoder().decode(uuid64);
            UUID uuid = uuidFromBytes(decoded);
            return uuid.toString();
        }

        public static byte[] uuidToBytes(UUID uuid) {
            ByteBuffer bb = ByteBuffer.wrap(new byte[16]);
            bb.putLong(uuid.getMostSignificantBits());
            bb.putLong(uuid.getLeastSignificantBits());
            return bb.array();
        }

        public static UUID uuidFromBytes(byte[] decoded) {
            ByteBuffer bb = ByteBuffer.wrap(decoded);
            long mostSigBits = bb.getLong();
            long leastSigBits = bb.getLong();
            return new UUID(mostSigBits, leastSigBits);
        }
    }

The Base64-encoded UUID is URL-safe and has no padding.
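You don't say which DBMS you are using, but if saving space is your concern, RAW seems like the best approach. You just need to remember to convert for every query, or you risk a huge drop in performance.
But I have to ask: are bytes really that expensive where you live?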
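This is not exactly what you asked for (it is not Base64), but it is worth a look because of the added flexibility: there is a Clojure library that implements a compact, 26-character, URL-safe representation of UUIDs (https://github.com/tonsky/compact-uuids).
Some highlights:
- Produces strings that are about 28% smaller (26 characters versus the traditional 36)
- Supports the full UUID range (128 bits)
- Encoding-safe (uses only readable ASCII characters)
- URL/file-name safe
- Lowercase/uppercase safe
- Avoids ambiguous characters (i/I/l/L/1/O/o/0)
- Alphabetical sorting of the encoded 26-character strings matches the default UUID sort order
These are rather nice properties. I have been using this encoding in my applications, both for database keys and for user-visible identifiers, and it works very well.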
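To illustrate how a 26-character, sort-preserving encoding can work, here is a sketch in Java. It is not the compact-uuids implementation: I am assuming the Crockford Base32 alphabet (which drops I, L, O and U) and a simple big-endian layout, and the class name `Uuid26` is made up. The library's actual alphabet and bit layout may differ.

```java
import java.util.UUID;

class Uuid26 {
    // Assumed alphabet (Crockford Base32); characters are in ascending ASCII order
    private static final String ALPHABET = "0123456789ABCDEFGHJKMNPQRSTVWXYZ";

    /** Encode 128 bits as 26 characters, 5 bits each, most significant first. */
    static String encode(UUID uuid) {
        long msb = uuid.getMostSignificantBits();
        long lsb = uuid.getLeastSignificantBits();
        char[] out = new char[26];
        for (int i = 25; i >= 0; i--) {
            out[i] = ALPHABET.charAt((int) (lsb & 0x1F));
            // Shift the 128-bit value (msb:lsb) right by 5 bits
            lsb = (lsb >>> 5) | (msb << 59);
            msb >>>= 5;
        }
        return new String(out);
    }

    /** Decode, accepting lowercase input as well. */
    static UUID decode(String s) {
        if (s.length() != 26) {
            throw new IllegalArgumentException("expected 26 characters");
        }
        long msb = 0;
        long lsb = 0;
        for (int i = 0; i < 26; i++) {
            int digit = ALPHABET.indexOf(Character.toUpperCase(s.charAt(i)));
            if (digit < 0) {
                throw new IllegalArgumentException("invalid character: " + s.charAt(i));
            }
            // Shift the 128-bit value left by 5 bits and append the new digit
            msb = (msb << 5) | (lsb >>> 59);
            lsb = (lsb << 5) | digit;
        }
        return new UUID(msb, lsb);
    }
}
```

Since every string has the same length and the alphabet is in ascending ASCII order, comparing encoded strings gives the same result as comparing the UUIDs as unsigned 128-bit numbers. Note that this can differ from `java.util.UUID.compareTo`, which compares the two halves as signed longs.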
Below is what I use for a UUID (Comb style). It includes code for converting a UUID string or a UUID object to base64. I do it 64 bits at a time, so I never have to deal with any equals signs:
Java

    import java.nio.ByteBuffer;
    import java.util.UUID;
    import org.apache.commons.codec.binary.Base64;

    public class UUIDUtil {

        // Comb UUID: random UUID whose lower 48 bits are replaced by the current time
        public static UUID combUUID() {
            UUID srcUUID = UUID.randomUUID();
            long nowMillis = System.currentTimeMillis();

            long upper16OfLowerUUID = zeroLower48BitsOfLong(srcUUID.getLeastSignificantBits());
            long lower48Time = zeroUpper16BitsOfLong(nowMillis);
            long lowerLongForNewUUID = upper16OfLowerUUID | lower48Time;
            return new UUID(srcUUID.getMostSignificantBits(), lowerLongForNewUUID);
        }

        public static String base64URLSafeOfUUIDObject(UUID uuid) {
            // Note: the least significant bits are written first here,
            // so any decoder must read the two longs in the same order.
            byte[] bytes = ByteBuffer.allocate(16)
                    .putLong(0, uuid.getLeastSignificantBits())
                    .putLong(8, uuid.getMostSignificantBits())
                    .array();
            return Base64.encodeBase64URLSafeString(bytes);
        }

        public static String base64URLSafeOfUUIDString(String uuidString) {
            UUID uuid = UUID.fromString(uuidString);
            return UUIDUtil.base64URLSafeOfUUIDObject(uuid);
        }

        private static long zeroLower48BitsOfLong(long longVar) {
            long upper16BitMask = -281474976710656L; // 0xFFFF000000000000
            return longVar & upper16BitMask;
        }

        private static long zeroUpper16BitsOfLong(long longVar) {
            long lower48BitMask = 281474976710656L - 1L; // 0x0000FFFFFFFFFFFF
            return longVar & lower48BitMask;
        }
    }