{"id":19483,"date":"2025-04-12T14:48:44","date_gmt":"2025-04-12T14:48:44","guid":{"rendered":"\/?p=19483"},"modified":"2025-04-12T15:08:45","modified_gmt":"2025-04-12T15:08:45","slug":"write-a-parser-to-find-a-second-author-how-compact-can-that-be","status":"publish","type":"post","link":"\/?p=19483","title":{"rendered":"&#8220;Write a parser to find a second author&#8221;? How compact can that be?"},"content":{"rendered":"<div data-rbd-draggable-context-id=\"14\" data-rbd-draggable-id=\"fi4mi\">\n<div class=\"\" data-block=\"true\" data-editor=\"82adc\" data-offset-key=\"fi4mi-0-0\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\" data-offset-key=\"fi4mi-0-0\">\n<p><span data-offset-key=\"fi4mi-0-0\">Commenting on https:\/\/x.com\/_jasonwei\/status\/1910398763476320422<\/span><\/p>\n<p>RichardKCollin2 wrote:<\/p>\n<p>Used as a search engine, full LLM queries are inefficient. The AI could write the web scraper and run it &#8211; if precise, more intelligent parts can be embedded. &#8220;find a cat&#8221;, &#8220;look for second author from _&#8221; &#8211; in ordinary programs as needed. &#8220;embedded AI algorithms for coding&#8221;.<\/p>\n<\/div>\n<\/div>\n<\/div>\n<div data-rbd-draggable-context-id=\"14\" data-rbd-draggable-id=\"542hi\">\n<div class=\"\" data-block=\"true\" data-editor=\"82adc\" data-offset-key=\"542hi-0-0\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\" data-offset-key=\"542hi-0-0\"><span data-offset-key=\"542hi-0-0\">\u00a0<\/span><\/div>\n<\/div>\n<\/div>\n<div data-rbd-draggable-context-id=\"14\" data-rbd-draggable-id=\"bt1nt\">\n<div class=\"\" data-block=\"true\" data-editor=\"82adc\" data-offset-key=\"bt1nt-0-0\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\" data-offset-key=\"bt1nt-0-0\"><span data-offset-key=\"bt1nt-0-0\">It can be run independent of the AI any time and extensively where billions of web pages have to be processed. 
<\/span><\/div>\n<\/div>\n<\/div>\n<div data-rbd-draggable-context-id=\"14\" data-rbd-draggable-id=\"a0mg1\">\n<div class=\"\" data-block=\"true\" data-editor=\"82adc\" data-offset-key=\"a0mg1-0-0\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\" data-offset-key=\"a0mg1-0-0\"><span data-offset-key=\"a0mg1-0-0\">\u00a0<\/span><\/div>\n<\/div>\n<\/div>\n<div data-rbd-draggable-context-id=\"14\" data-rbd-draggable-id=\"au6r1\">\n<div class=\"\" data-block=\"true\" data-editor=\"82adc\" data-offset-key=\"au6r1-0-0\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\" data-offset-key=\"au6r1-0-0\"><span data-offset-key=\"au6r1-0-0\">It means distilling the essence of &#8220;find something more complex&#8221; into code in ordinary programming languages. What are the roots of &#8220;find a second author&#8221;? Today it lives in SQL, regex, parse trees and things like that. It could also live in really tiny neural nets and other things. <\/span><\/div>\n<\/div>\n<\/div>\n<div data-rbd-draggable-context-id=\"14\" data-rbd-draggable-id=\"9vt6q\">\n<div class=\"\" data-block=\"true\" data-editor=\"82adc\" data-offset-key=\"9vt6q-0-0\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\" data-offset-key=\"9vt6q-0-0\"><span data-offset-key=\"9vt6q-0-0\">\u00a0<\/span><\/div>\n<\/div>\n<\/div>\n<div data-rbd-draggable-context-id=\"14\" data-rbd-draggable-id=\"6cvtb\">\n<div class=\"\" data-block=\"true\" data-editor=\"82adc\" data-offset-key=\"6cvtb-0-0\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\" data-offset-key=\"6cvtb-0-0\"><span data-offset-key=\"6cvtb-0-0\">&#8220;Write a parser to find a second author&#8221;? How compact can that be? 
<\/span><\/div>\n<\/div>\n<\/div>\n<div data-rbd-draggable-context-id=\"14\" data-rbd-draggable-id=\"1s4j9\">\n<div class=\"\" data-block=\"true\" data-editor=\"82adc\" data-offset-key=\"1s4j9-0-0\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\" data-offset-key=\"1s4j9-0-0\"><span data-offset-key=\"1s4j9-0-0\">\u00a0<\/span><\/div>\n<\/div>\n<\/div>\n<div data-rbd-draggable-context-id=\"14\" data-rbd-draggable-id=\"910mc\">\n<div class=\"\" data-block=\"true\" data-editor=\"82adc\" data-offset-key=\"910mc-0-0\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\" data-offset-key=\"910mc-0-0\"><span data-offset-key=\"910mc-0-0\">Could the &#8220;second authors&#8221; all be coded in the web pages? That is the purpose of &#8220;use global open tokens for the whole internet so the Internet is pre-tokenized for instant AI uses&#8221;. The aim is not to scrape unverified text but to pre-compile and encode the Internet as a whole. <\/span><\/div>\n<\/div>\n<\/div>\n<div data-rbd-draggable-context-id=\"14\" data-rbd-draggable-id=\"9qphr\">\n<div class=\"\" data-block=\"true\" data-editor=\"82adc\" data-offset-key=\"9qphr-0-0\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\" data-offset-key=\"9qphr-0-0\"><span data-offset-key=\"9qphr-0-0\">\u00a0<\/span><\/div>\n<\/div>\n<\/div>\n<div data-rbd-draggable-context-id=\"14\" data-rbd-draggable-id=\"kcgr\">\n<div class=\"\" data-block=\"true\" data-editor=\"82adc\" data-offset-key=\"kcgr-0-0\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\" data-offset-key=\"kcgr-0-0\"><span data-offset-key=\"kcgr-0-0\">It allows the AI groups to standardize, index and independently query the full Internet.<\/span><\/div>\n<\/div>\n<\/div>\n<div data-rbd-draggable-context-id=\"14\" data-rbd-draggable-id=\"a0hf9\">\n<div class=\"\" data-block=\"true\" data-editor=\"82adc\" data-offset-key=\"a0hf9-0-0\">\n<div class=\"public-DraftStyleDefault-block 
public-DraftStyleDefault-ltr\" data-offset-key=\"a0hf9-0-0\"><span data-offset-key=\"a0hf9-0-0\">\u00a0<\/span><\/div>\n<\/div>\n<\/div>\n<div data-rbd-draggable-context-id=\"14\" data-rbd-draggable-id=\"7hhb3\">\n<div class=\"\" data-block=\"true\" data-editor=\"82adc\" data-offset-key=\"7hhb3-0-0\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\" data-offset-key=\"7hhb3-0-0\"><span data-offset-key=\"7hhb3-0-0\">Richard Collins, The Internet Foundation<\/span><\/div>\n<\/div>\n<\/div>\n<div data-offset-key=\"7hhb3-0-0\">\n<hr \/>\n<p>Commenting on\u00a0<a href=\"https:\/\/x.com\/ai_ctrl\/status\/1910690884229771502\">https:\/\/x.com\/ai_ctrl\/status\/1910690884229771502<\/a><\/p>\n<\/div>\n<div data-offset-key=\"7hhb3-0-0\">There is a 100% chance that humans using AIs will be trying to take over NOW. If AIs can operate on their own, that is simply an escalation of what humans already face. How to hold humans and their corporations using AIs accountable comes first. Not &#8220;super-intelligence&#8221; but humans.<\/div>\n<div data-offset-key=\"7hhb3-0-0\">\n<hr \/>\n<\/div>\n<div data-offset-key=\"7hhb3-0-0\">Commenting on\u00a0<a href=\"https:\/\/x.com\/AllenInstitute\/status\/1910410726071546358\">https:\/\/x.com\/AllenInstitute\/status\/1910410726071546358<\/a><\/div>\n<div data-offset-key=\"7hhb3-0-0\">That is NOT complex, just a few dozen major branches, and a tiny 84,000 nodes and their links. Yes, a good step, beautiful, but only a pretty picture now. Share those neuron networks in an open format to make it useful to 5.4 billion humans using the Internet. Make it low cost.<\/div>\n<div data-offset-key=\"7hhb3-0-0\"><\/div>\n<div data-offset-key=\"7hhb3-0-0\"><\/div>\n","protected":false},"excerpt":{"rendered":"<p>Commenting on https:\/\/x.com\/_jasonwei\/status\/1910398763476320422 RichardKCollin2 wrote: Used as a search engine, full LLM queries are inefficient. 
The AI could write the web scraper and run it &#8211; if precise, more intelligent parts can be embedded. &#8220;find a cat&#8221;, &#8220;look for second author from _&#8221; &#8211; in ordinary programs as needed. &#8220;embedded AI algorithms for coding&#8221;. \u00a0 <br \/><a class=\"read-more-button\" href=\"\/?p=19483\">Read More &raquo;<\/a><\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[73],"tags":[],"class_list":["post-19483","post","type-post","status-publish","format-standard","hentry","category-all-knowledge"],"_links":{"self":[{"href":"\/index.php?rest_route=\/wp\/v2\/posts\/19483","targetHints":{"allow":["GET"]}}],"collection":[{"href":"\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"\/index.php?rest_route=\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=19483"}],"version-history":[{"count":6,"href":"\/index.php?rest_route=\/wp\/v2\/posts\/19483\/revisions"}],"predecessor-version":[{"id":19489,"href":"\/index.php?rest_route=\/wp\/v2\/posts\/19483\/revisions\/19489"}],"wp:attachment":[{"href":"\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=19483"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=19483"},{"taxonomy":"post_tag","embeddable":true,"href":"\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=19483"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}