@alexonit@twtxt.alessandrocutolo.it

Writing code for work, fun and everything in between.

@alexonit@twtxt.alessandrocutolo.it

@lyse I think it will be bad if handled incorrectly.

The client must reference both properly or it would miss posts. Including both this way is a bit pointless if you can't use the hash or URL separately.

Since it is highly likely to be a breaking change anyway, I think @zvava's proposal looks much better.

In reply to: #2kiw2vq 6 months ago
@alexonit@twtxt.alessandrocutolo.it

@prologic I think nobody will stop you from replacing the current hashing with SHA-256, as long as you call it an improvement™ 😉

In reply to: #ce7zzfq 6 months ago
@alexonit@twtxt.alessandrocutolo.it

That's what I'm using right now, while my own client is still in the making.

It's a simple bash script that writes a post to a mktemp file, then cleans it up with a regex. I don't even bother to hash the replies: I just open https://twtxt.net and copy the hash by hand, since I'm checking the new posts from there anyway (temporarily, as my client might end up DoS-ing everyone's feed right now).

In reply to: #ce7zzfq 6 months ago
@alexonit@twtxt.alessandrocutolo.it

@prologic While it might work if you want to keep both, I think the point was to be able to use one or the other. If we still have to generate the hash anyway, it might be pointless to use this format.

In reply to: #7fsi7yq 6 months ago
@alexonit@twtxt.alessandrocutolo.it

@prologic That is really great to hear!

If there are opposing opinions we either build a bridge or provide a new parallel road.

Also, I wouldn't call my opinion a "stance", I just wish for a better twtxt thanks to everyone's effort.

The last thing we need to do is decide a proper format for the location-based version.

My proposal is to keep the "Subject extension" unchanged and include the reference to the mention like this:

// Current hash format: starts with a '#'
(#hash) here's text
(#hash) @<nick url> here's text

// New location format: valid URL-like + '#' + TIMESTAMP (verbatim format of feed source)
(url#timestamp) here's text
(url#timestamp) @<nick url> here's text

I think the timestamp should be referenced verbatim to prevent broken references with multiple variations (especially with the many timezones out there) which would also make it even easier to implement for everyone.
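To make the verbatim idea concrete, here is a minimal sketch of building and parsing a `(url#timestamp)` subject. The helper names `makeLocationRef` and `parseLocationSubject` are hypothetical and not part of any spec; the only assumption is that the timestamp is copied exactly as it appears in the feed source, without parsing or normalizing it.

```javascript
// Hypothetical helpers, for illustration only.

// Build a location reference from the feed URL and the raw timestamp string.
// The timestamp is used verbatim: no date parsing, no timezone conversion.
function makeLocationRef(feedUrl, rawTimestamp) {
    return `${feedUrl}#${rawTimestamp}`;
}

// Parse a post that starts with "(url#timestamp)".
// Returns null when the post has no location-style subject.
function parseLocationSubject(text) {
    const match = /^\((\S+)#([^#\s)]+)\)\s*(.*)$/s.exec(text);
    if (!match) return null;
    return { url: match[1], timestamp: match[2], content: match[3] };
}
```

Because the reference is a plain string comparison, `2025-01-01T12:00:00+01:00` and `2025-01-01T11:00:00Z` would be different references even though they name the same instant, which is why copying the timestamp verbatim matters.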

I'm sure we can get @zvava, @lyse and everyone else to help on this one.

I personally think we should also consider allowing a generic format to build on custom references. This would allow creating threads from any custom source (manual, computed or externally generated), maybe using a new "Topic extension". Here are some examples.

// New format for custom references: starts with a '!' maybe?
(!custom) here's text
(!custom) @<nick url> here's text

// A possible "Topic" parse as a thread root:
[!custom] start here
[custom] simpler format

This one is just an idea of mine, but I feel it can unleash new ways of using twtxt.
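As a rough sketch of how a client might tell these reference styles apart, the function below classifies the start of a post. Everything here is an assumption made for illustration: the name `classifySubject`, the returned shape, and the `!` and `[...]` markers, which are only ideas floated above and not an agreed format.

```javascript
// Hypothetical classifier for the reference styles discussed above.
function classifySubject(text) {
    // "(...)" at the start of the post is treated as a subject.
    const subject = /^\(([^)\s]+)\)/.exec(text);
    if (subject) {
        const ref = subject[1];
        if (ref.startsWith('#')) return { kind: 'hash', ref: ref.slice(1) };
        if (ref.startsWith('!')) return { kind: 'custom', ref: ref.slice(1) };
        if (ref.includes('#')) return { kind: 'location', ref };
        return { kind: 'unknown', ref };
    }
    // "[!custom]" or "[custom]" at the start is treated as a topic root.
    const topic = /^\[!?([^\]]+)\]/.exec(text);
    if (topic) return { kind: 'topic', ref: topic[1] };
    return { kind: 'none', ref: null };
}
```

Clients that only understand the hash form would simply fall through on the other kinds and show those posts as plain text.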

In reply to: #3h7w7ca 6 months ago
@alexonit@twtxt.alessandrocutolo.it

@lyse @prologic Can't we find a middle ground and support both?

The thread is defined by two parts:

  1. The hash
  2. The subject

The client/pod generates the hash and indexes it in its database/cache, then it simply queries the subject of other posts to find the related ones, right?

In my own client's current implementation (using hashes), the only calculation is the hash generation; the rest is a verbatim copy of the subject (minus the # character). If this is the commonly implemented approach, then adding the location-based one is fairly simple.

function setPostIndex(post) {
    // Current hash approach
    const hash = createHash(post.url, post.timestamp, post.content);

    // New location approach
    const location = post.url + '#' + post.timestamp;

    // Unchanged (probably)
    const subject = post.subject;

    // Index them all
    addToIndex(hash, post);
    addToIndex(location, post);
    addToIndex(subject, post);
}

// Both should work if the index contains both versions
getThreadBySubject('#abcdef'); // => [post1, post2, post3] (hash)
getThreadBySubject('https://example.com#2025-01-01T12:00:00'); // => [post1, post2, post3] (location)
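For anyone who wants to try the idea, here is a self-contained sketch of the same index using an in-memory `Map`. It makes several assumptions: `createHash` is a stand-in built on SHA-256 and is NOT the hash defined by the current extension, `addToIndex` and `getThreadBySubject` are hypothetical helpers, and a post is a plain object with `url`, `timestamp`, `content` and `subject` fields.

```javascript
const crypto = require('node:crypto');

// Stand-in only: NOT the real twt hash, just something deterministic.
function createHash(url, timestamp, content) {
    const digest = crypto.createHash('sha256')
        .update(`${url}\n${timestamp}\n${content}`)
        .digest('hex');
    return '#' + digest.slice(0, 7);
}

// key -> array of posts that belong to that key
const index = new Map();

function addToIndex(key, post) {
    if (!key) return; // a post without a subject has nothing to index here
    if (!index.has(key)) index.set(key, []);
    const bucket = index.get(key);
    if (!bucket.includes(post)) bucket.push(post);
}

function setPostIndex(post) {
    // Index the post under its own hash and its own location...
    addToIndex(createHash(post.url, post.timestamp, post.content), post);
    addToIndex(post.url + '#' + post.timestamp, post);
    // ...and under whatever subject it replies to (copied verbatim).
    addToIndex(post.subject, post);
}

function getThreadBySubject(key) {
    return index.get(key) || [];
}
```

In this naive version a lookup by hash returns the root plus the replies that used the hash, and a lookup by location returns the root plus the replies that used the location. Getting the full thread from either key would need an extra step that merges the two buckets of the root post.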

As I said before, the mention is already location based (@<example https://example.com/twtxt.txt>), so I think we should take that into consideration.

Of course this will lead to a bit of fragmentation (without merging the two) but I think this can make everyone happy.

Otherwise, the only other solution I can think of is a different approach where the value doesn't matter, so anything can be used as a reference (hash, location, git commit) for greater flexibility and freedom of implementation. This would probably need a fixed "header" for each post, but that can be seen as a separate extension.

In reply to: #3h7w7ca 6 months ago
@alexonit@twtxt.alessandrocutolo.it

@prologic I can see the issues mentioned, but I think some can be fixed.

  1. The current hash relies on a url field too. By specification, it uses the first # url = <URL> in the feed's metadata if present, which can also differ from the fetching source. If that field changes, it breaks the existing hashes too. A better solution would be to use a non-URL key like # feed_id = <UNIQUE_RANDOM_STRING>, with the url as a fallback.

  2. We can prevent duplication if the reference uses that same url field too, or if the client "collapses" references to any of the urls defined in the metadata.

  3. I agree that hashing based on content is good, but we still use the URL as part of the hash, and that is just a field in the feed, easily replicated by a bot. Edits can also break the hash. For this issue, an alternative solution (e.g. a private key not included in the feed) should be considered.

  4. For offline reading the source would already be downloaded, and fetching non-followed feeds would fill the gap the same way mentions do. Maybe I'm missing some context on this one.

  5. To prevent collisions there was a discussion about extending the hash (I forgot whether that was already fixed or not), but without a fallback that would break existing clients too. We should think of a parallel format that leaves current implementations unchanged. We are already backward compatible with the original clients that don't use threads at all, and a mention-style format could be even more user-friendly for those clients.
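Points 1 and 2 can be sketched roughly like this. The `# key = value` comment lines are the existing metadata convention; the `feed_id` key is only the suggestion from point 1, and the function names `parseMetadata`, `feedIdentity` and `canonicalReference` are hypothetical.

```javascript
// Collect "# key = value" metadata lines, keeping every value in file order.
function parseMetadata(feedText) {
    const meta = new Map();
    for (const line of feedText.split('\n')) {
        const match = /^#\s*(\w+)\s*=\s*(.+?)\s*$/.exec(line);
        if (!match) continue;
        const [, key, value] = match;
        if (!meta.has(key)) meta.set(key, []);
        meta.get(key).push(value);
    }
    return meta;
}

// Point 1: prefer a stable feed_id (proposed, not standard),
// then the first url field, then the URL the feed was fetched from.
function feedIdentity(feedText, fetchUrl) {
    const meta = parseMetadata(feedText);
    const ids = meta.get('feed_id');
    if (ids) return ids[0];
    const urls = meta.get('url');
    if (urls) return urls[0];
    return fetchUrl;
}

// Point 2: "collapse" a url#timestamp reference that uses any of the
// feed's known urls into one canonical reference.
function canonicalReference(reference, feedText, fetchUrl) {
    const split = reference.lastIndexOf('#');
    if (split < 0) return reference;
    const url = reference.slice(0, split);
    const timestamp = reference.slice(split + 1);
    const known = parseMetadata(feedText).get('url') || [];
    if (url === fetchUrl || known.includes(url)) {
        return feedIdentity(feedText, fetchUrl) + '#' + timestamp;
    }
    return reference;
}
```

With this, two replies that point at the same post through different mirrors of the feed would end up under a single key in the client's index.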

We should also keep in mind that the current mention format is already location based (@<example https://example.com/twtxt.txt>), so I'm not that worried about threads working the same way.

Hope to see some other thoughts on this matter. 🤓

In reply to: #altkl2a 6 months ago
@alexonit@twtxt.alessandrocutolo.it

@zvava @lyse I also think a location based reference might be better.

A thread has a single post of a single feed as its root, but the hash has the drawback of not referencing the source. In a distributed network like twtxt, that might leave some people out of the whole conversation.

I suggest a simpler format, something like: (#<TIMESTAMP URL>)

This solves three issues:

  • Easier referencing: no need to generate a hash, just copy the timestamp and url. It's also simpler to implement in a client, without the risk of collisions when putting things together
  • Fetchable source: you can find the source within the reference and construct the thread from there
  • Allow editing: if a post is modified the hash becomes invalid, since it depends on [ timestamp, url, content ], while a timestamp and url reference stays the same
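A small sketch of the editing point, under stated assumptions: `contentKey` is a SHA-256 stand-in for a content-based hash (not the real twt hash), and `locationKey` and `parseLocationKey` are hypothetical helpers for the `(#<TIMESTAMP URL>)` form suggested above.

```javascript
const crypto = require('node:crypto');

// Stand-in for a hash over [ timestamp, url, content ].
function contentKey(post) {
    return crypto.createHash('sha256')
        .update(`${post.timestamp}\n${post.url}\n${post.content}`)
        .digest('hex')
        .slice(0, 7);
}

// Location reference in the suggested "#<TIMESTAMP URL>" form.
function locationKey(post) {
    return `#<${post.timestamp} ${post.url}>`;
}

// Read the timestamp and url back from a post starting with "(#<TIMESTAMP URL>)".
function parseLocationKey(text) {
    const match = /^\(#<(\S+)\s+(\S+)>\)\s*(.*)$/s.exec(text);
    if (!match) return null;
    return { timestamp: match[1], url: match[2], content: match[3] };
}
```

If a post's content is edited, `contentKey` changes while `locationKey` does not, so replies that used the location form keep pointing at the same post, and the url inside the reference tells the client where to fetch the root from.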
In reply to: #dvw775q 6 months ago