{"id":17243,"date":"2024-09-15T02:40:44","date_gmt":"2024-09-15T02:40:44","guid":{"rendered":"\/?p=17243"},"modified":"2024-09-15T02:53:40","modified_gmt":"2024-09-15T02:53:40","slug":"17243","status":"publish","type":"post","link":"\/?p=17243","title":{"rendered":"Tokenizing and compressing, so humans and computers can solve large problems with finite memory &#8211; where it matters"},"content":{"rendered":"<div data-rbd-draggable-context-id=\"4\" data-rbd-draggable-id=\"aeu3h\">\n<div class=\"\" data-block=\"true\" data-editor=\"cjfb4\" data-offset-key=\"aeu3h-0-0\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\" data-offset-key=\"aeu3h-0-0\"><span data-offset-key=\"aeu3h-0-0\"><span data-offset-key=\"aeu3h-0-0\">Yi Ma @YiMaTweets Learning is all about maximizing information. We compress to learn and we learn to compress.<br \/>\nReplying to @YiMaTweets<br \/>\n<\/span><\/span><\/p>\n<hr \/>\n<p><span data-offset-key=\"aeu3h-0-0\">Filed as <em>Tokenizing and compressing, so humans and computers can solve large problems with finite memory &#8211; where it matters<\/em><\/span><\/p>\n<p>Yi Ma, It is not maximizing information so much as maximizing the chance that the information is remembered and applied correctly in critical situations, especially where human lives are on the line. Where it matters, a lot. Where communications are limited, tokenizing to global open verified identifiers greatly compresses global scale systems. That compression allows getting critical messages through without error.<\/p>\n<p>We compress to allow our finite human memories to hold more in memory at once. There are many problems where &#8220;having the whole problem in mind at once&#8221; is a critical part of solving it. So tokenizing (one way of compressing) is simply making it so the problem can be solved at all &#8211; using unaided humans. A global open token is easier to remember and apply consistently so there is no ambiguity in knowing what the pieces mean. In a game or process, knowing the precise steps, practicing, and then doing them exactly might save lives, or get the job done where it really matters.<\/p>\n<\/div>\n<\/div>\n<\/div>\n<div data-rbd-draggable-context-id=\"4\" data-rbd-draggable-id=\"7ih4b\">\n<div class=\"\" data-block=\"true\" data-editor=\"cjfb4\" data-offset-key=\"7ih4b-0-0\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\" data-offset-key=\"7ih4b-0-0\"><span data-offset-key=\"7ih4b-0-0\">Compressing to learn &#8211; there are problems where the steps are hard for humans to remember exactly but a correct resolution requires that steps be done precisely as the right time. The movie, Hidden Figures, about women doing calculations &#8220;by hand&#8221; is a good example. By constant practice, and by their skill in remembering things, they could come to solutions to safely get humans safely to orbit and back. Any one piece meant the difference between life and death.<\/span><\/div>\n<\/div>\n<\/div>\n<div data-rbd-draggable-context-id=\"4\" data-rbd-draggable-id=\"97tta\">\n<div class=\"\" data-block=\"true\" data-editor=\"cjfb4\" data-offset-key=\"97tta-0-0\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\" data-offset-key=\"97tta-0-0\"><span data-offset-key=\"97tta-0-0\">\u00a0<\/span><\/div>\n<\/div>\n<\/div>\n<div data-rbd-draggable-context-id=\"4\" data-rbd-draggable-id=\"24hqo\">\n<div class=\"\" data-block=\"true\" data-editor=\"cjfb4\" data-offset-key=\"24hqo-0-0\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\" data-offset-key=\"24hqo-0-0\"><span data-offset-key=\"24hqo-0-0\">There are many life-critical things in life. Knowing what to avoid, and where to go, how to find things, how to do things. These are good survival skills. If you have no food, will you ask your neighbors? In mathematics, the symbols for things are small, and if your brain is trained to instantly encode precisely and without error, a finite human brain can hold all the pieces in mind, in order to make correct decisions and choices. We have hierarchies where data is collected and encoded, put into computers and filtered to the right people who can hold things in mind. They can then make good decisions. But if the computer systems are using words that can be ambiguous, they (with their finite memories allocated) will not be able to solve it. We see that already in AI applications might have not stored enough of the original data to used it because the sources of data is lost. Literally not stored or not accessible. <\/span><\/div>\n<\/div>\n<\/div>\n<div data-rbd-draggable-context-id=\"4\" data-rbd-draggable-id=\"4gqbv\">\n<div class=\"\" data-block=\"true\" data-editor=\"cjfb4\" data-offset-key=\"4gqbv-0-0\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\" data-offset-key=\"4gqbv-0-0\"><span data-offset-key=\"4gqbv-0-0\">\u00a0<\/span><\/div>\n<\/div>\n<\/div>\n<div data-rbd-draggable-context-id=\"4\" data-rbd-draggable-id=\"6siho\">\n<div class=\"\" data-block=\"true\" data-editor=\"cjfb4\" data-offset-key=\"6siho-0-0\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\" data-offset-key=\"6siho-0-0\"><span data-offset-key=\"6siho-0-0\">It is hard to know what you are aiming to do. Your posts are short.<\/span><\/div>\n<\/div>\n<\/div>\n<div data-rbd-draggable-context-id=\"4\" data-rbd-draggable-id=\"e198\">\n<div class=\"\" data-block=\"true\" data-editor=\"cjfb4\" data-offset-key=\"e198-0-0\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\" data-offset-key=\"e198-0-0\"><span data-offset-key=\"e198-0-0\">\u00a0<\/span><\/div>\n<\/div>\n<\/div>\n<div data-rbd-draggable-context-id=\"4\" data-rbd-draggable-id=\"37ku8\">\n<div class=\"\" data-block=\"true\" data-editor=\"cjfb4\" data-offset-key=\"37ku8-0-0\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\" data-offset-key=\"37ku8-0-0\"><span data-offset-key=\"37ku8-0-0\">I found that ( any datasets and systems that were built &#8220;by accretion&#8221; by humans &#8212; generally they can be compressed by a factor of about 200x by fairly simple algorithms.)\u00a0 So taking problems where they have a billion items, that is likely to compress, losslessly, to about 5E6 (5 Million). When all you have are computers with their limited memories, and the job &#8220;has to be done exactly right or people die&#8221; you encode where needed, do not leave anything out, and likely pray constantly that you do it right.<\/span><\/div>\n<\/div>\n<\/div>\n<div data-rbd-draggable-context-id=\"4\" data-rbd-draggable-id=\"29qm4\">\n<div class=\"\" data-block=\"true\" data-editor=\"cjfb4\" data-offset-key=\"29qm4-0-0\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\" data-offset-key=\"29qm4-0-0\"><span data-offset-key=\"29qm4-0-0\">\u00a0<\/span><\/div>\n<\/div>\n<\/div>\n<div data-rbd-draggable-context-id=\"4\" data-rbd-draggable-id=\"423f0\">\n<div class=\"\" data-block=\"true\" data-editor=\"cjfb4\" data-offset-key=\"423f0-0-0\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\" data-offset-key=\"423f0-0-0\"><span data-offset-key=\"423f0-0-0\">In large meetings, in large organizations, with complex problems evolving quickly, I have learned to listen and read, encode everything efficiently, and then be able to summarize it in mind as one scene or one movie, or one dynamic process &#8211; and then give directions that work for the whole.<\/span><\/div>\n<\/div>\n<\/div>\n<div data-rbd-draggable-context-id=\"4\" data-rbd-draggable-id=\"rhjc\">\n<div class=\"\" data-block=\"true\" data-editor=\"cjfb4\" data-offset-key=\"rhjc-0-0\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\" data-offset-key=\"rhjc-0-0\"><span data-offset-key=\"rhjc-0-0\">\u00a0<\/span><\/div>\n<\/div>\n<\/div>\n<div data-rbd-draggable-context-id=\"4\" data-rbd-draggable-id=\"4cue4\">\n<div class=\"\" data-block=\"true\" data-editor=\"cjfb4\" data-offset-key=\"4cue4-0-0\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\" data-offset-key=\"4cue4-0-0\"><span data-offset-key=\"4cue4-0-0\">Please try to remember this I am writing now. I think it might make a difference to you and people around you.<\/span><\/div>\n<\/div>\n<\/div>\n<div data-rbd-draggable-context-id=\"4\" data-rbd-draggable-id=\"834kk\">\n<div class=\"\" data-block=\"true\" data-editor=\"cjfb4\" data-offset-key=\"834kk-0-0\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\" data-offset-key=\"834kk-0-0\"><span data-offset-key=\"834kk-0-0\">\u00a0<\/span><\/div>\n<\/div>\n<\/div>\n<div data-rbd-draggable-context-id=\"4\" data-rbd-draggable-id=\"9l7s2\">\n<div class=\"\" data-block=\"true\" data-editor=\"cjfb4\" data-offset-key=\"9l7s2-0-0\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\" data-offset-key=\"9l7s2-0-0\"><span data-offset-key=\"9l7s2-0-0\">Richard Collins, The Internet Foundation<\/span><\/div>\n<\/div>\n<\/div>\n<div data-rbd-draggable-context-id=\"4\" data-rbd-draggable-id=\"82sso\">\n<div class=\"\" data-block=\"true\" data-editor=\"cjfb4\" data-offset-key=\"82sso-0-0\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\" data-offset-key=\"82sso-0-0\"><span data-offset-key=\"82sso-0-0\">\u00a0<\/span><\/div>\n<\/div>\n<\/div>\n<div data-rbd-draggable-context-id=\"4\" data-rbd-draggable-id=\"3nula\">\n<div class=\"\" data-block=\"true\" data-editor=\"cjfb4\" data-offset-key=\"3nula-0-0\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\" data-offset-key=\"3nula-0-0\"><span data-offset-key=\"3nula-0-0\">\u00a0<\/span><\/div>\n<\/div>\n<\/div>\n<div data-rbd-draggable-context-id=\"4\" data-rbd-draggable-id=\"60vkd\">\n<div class=\"\" data-block=\"true\" data-editor=\"cjfb4\" data-offset-key=\"60vkd-0-0\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\" data-offset-key=\"60vkd-0-0\"><span data-offset-key=\"60vkd-0-0\">\u00a0<\/span><\/div>\n<\/div>\n<\/div>\n<div data-rbd-draggable-context-id=\"4\" data-rbd-draggable-id=\"9pgni\">\n<div class=\"\" data-block=\"true\" data-editor=\"cjfb4\" data-offset-key=\"9pgni-0-0\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\" data-offset-key=\"9pgni-0-0\"><span data-offset-key=\"9pgni-0-0\">\u00a0<\/span><\/div>\n<\/div>\n<\/div>\n<div data-rbd-draggable-context-id=\"4\" data-rbd-draggable-id=\"6vaq7\">\n<div class=\"\" data-block=\"true\" data-editor=\"cjfb4\" data-offset-key=\"6vaq7-0-0\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\" data-offset-key=\"6vaq7-0-0\"><span data-offset-key=\"6vaq7-0-0\">\u00a0<\/span><\/div>\n<\/div>\n<\/div>\n<div data-rbd-draggable-context-id=\"4\" data-rbd-draggable-id=\"85cdi\">\n<div class=\"\" data-block=\"true\" data-editor=\"cjfb4\" data-offset-key=\"85cdi-0-0\"><\/div>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>Yi Ma @YiMaTweets Learning is all about maximizing information. We compress to learn and we learn to compress. Replying to @YiMaTweets Filed as Tokenizing and compressing, so humans and computers can solve large problems with finite memory &#8211; where it matters Yi Ma, It is not maximizing information so much as maximizing the chance that <br \/><a class=\"read-more-button\" href=\"\/?p=17243\">Read More &raquo;<\/a><\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[77,73,72],"tags":[],"class_list":["post-17243","post","type-post","status-publish","format-standard","hentry","category-all-global-open-devices","category-all-knowledge","category-all-languages"],"_links":{"self":[{"href":"\/index.php?rest_route=\/wp\/v2\/posts\/17243","targetHints":{"allow":["GET"]}}],"collection":[{"href":"\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"\/index.php?rest_route=\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=17243"}],"version-history":[{"count":4,"href":"\/index.php?rest_route=\/wp\/v2\/posts\/17243\/revisions"}],"predecessor-version":[{"id":17247,"href":"\/index.php?rest_route=\/wp\/v2\/posts\/17243\/revisions\/17247"}],"wp:attachment":[{"href":"\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=17243"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=17243"},{"taxonomy":"post_tag","embeddable":true,"href":"\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=17243"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}