Matrixstats is a public catalog of Matrix rooms. It provides an easy way to browse a large number of rooms from public homeservers and to sort them in different ways.
We use the bot to collect information about homeservers. This information fills the catalog and the homeservers page, and includes public room details and statistics on server responses.
The bot can also be invited into a specific room to provide additional stats for that room, such as the number of unique senders and the total number of messages over a given period. This information is used to highlight active rooms and make them more visible to future visitors. The statistics can also be viewed on the room details page afterwards.
For public homeservers we collect public room details, the number of successful and failed requests, and the time of the last synchronization.
For joined rooms we collect event IDs and MXIDs, which are used to count and deduplicate events. This information is stored for up to a month and then collapsed into aggregate numbers. We don't collect anything except event IDs and MXIDs.
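As a rough illustration of how event IDs and MXIDs are enough to produce these numbers, here is a minimal sketch. The function name `summarize_events` and the sample data are hypothetical and are not taken from the bot's code; the sketch only assumes that each event arrives as an (event ID, sender MXID) pair and may be delivered more than once.

```python
def summarize_events(events):
    """Collapse (event_id, sender_mxid) pairs into two aggregate numbers.

    Duplicate deliveries of the same event ID are counted once.
    """
    seen_event_ids = set()
    senders = set()
    for event_id, mxid in events:
        if event_id in seen_event_ids:
            continue  # already counted this event
        seen_event_ids.add(event_id)
        senders.add(mxid)
    return {
        "total_messages": len(seen_event_ids),
        "unique_senders": len(senders),
    }


# Hypothetical sample: "$evt1" is delivered twice.
sample = [
    ("$evt1", "@alice:example.org"),
    ("$evt2", "@bob:example.org"),
    ("$evt1", "@alice:example.org"),
    ("$evt3", "@alice:example.org"),
]
summary = summarize_events(sample)
print(summary)
```

Once the summary is computed, the individual IDs are no longer needed and only the two counters have to be kept.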
No. The bot doesn't store messages in any way. It's only used to collect statistical data.
While we can't guarantee this in any verifiable way, we believe the project's goals are transparent enough that the bot need not be considered a threat. It's just a room member, and its behavior should be treated the same way as any other member's.
We decided to split registrations between the homeservers for a few reasons. First, it allows us to collect rooms from disconnected federations, i.e. each server can be queried for its own rooms even if it is not federated with the others. Some rooms may be interesting to visitors but unreachable in the usual ways. Second, it allows us to query the server as a regular user and collect health statistics, such as the number of successful and failed sync requests. This information can be used to notify users about server problems. And third, a distributed setup can be more fault-tolerant in case of federation-level issues.
The bot just mimics client behavior, i.e. it waits for events from the server and retries if necessary. However, it works 24/7, while other users may reconnect periodically. In terms of load it shouldn't differ much from a regular client.
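The wait-and-retry behavior can be pictured with the following sketch. It is not the bot's actual code: `sync_loop`, `fetch`, `handle` and the backoff parameters are hypothetical names chosen for illustration. The only thing borrowed from the Matrix client-server API is the idea that each sync response carries a `next_batch` token that is passed to the next request.

```python
import time


def sync_loop(fetch, handle, max_rounds, base_delay=1.0, max_delay=60.0,
              sleep=time.sleep):
    """Poll for events like a regular client, backing off on failures.

    fetch(since) is assumed to return a dict with a "next_batch" token
    and to raise ConnectionError when the server is unreachable.
    """
    since = None
    delay = base_delay
    for _ in range(max_rounds):
        try:
            batch = fetch(since)
        except ConnectionError:
            sleep(delay)                      # wait before retrying
            delay = min(delay * 2, max_delay)  # exponential backoff, capped
            continue
        delay = base_delay                    # reset after a success
        since = batch["next_batch"]
        handle(batch)
    return since


# Simulated server: the second request fails, the others succeed.
calls = {"n": 0}


def fake_fetch(since):
    calls["n"] += 1
    if calls["n"] == 2:
        raise ConnectionError("simulated outage")
    return {"next_batch": f"token{calls['n']}", "events": []}


handled = []
waits = []
last_token = sync_loop(fake_fetch, handled.append, max_rounds=4,
                       sleep=waits.append)
```

A real client would loop indefinitely; the `max_rounds` bound exists only so the example terminates.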
We had some limited performance issues in the past, related to suboptimal sync logic. Things have changed considerably since then, and this shouldn't be an issue anymore.
We turned off sync functionality for most homeservers because it can be abused. Homeserver users can exploit the bot to mount a denial-of-service attack that can't be easily prevented. We are working on a mechanism to prevent this, but it may take some time.
If you are interested in sync functionality, please send us a message. We will turn sync on, after which the bot can be invited to rooms and gather healthmap and statistical data.
Homeservers are added manually at the moment. While the first wave of servers was collected automatically, we decided not to continue this practice. The reason is that some homeserver owners may not want to appear in public catalogs even if the server is considered public. This cannot be predicted reliably, so manual discovery was chosen.
We log each request to the server and calculate the ratio of successful requests to failed ones. We consider "200 OK" a successful response and everything else, including connection errors, a failed one.
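The rule above can be sketched in a few lines. The helper names and the sample log are hypothetical, and the sketch reports the share of successful requests among all logged requests, which is one possible reading of the ratio; a connection error is represented as `None` because no status code was received.

```python
def is_success(status_code):
    """Only HTTP 200 counts as success; None stands for a connection error."""
    return status_code == 200


def success_ratio(outcomes):
    """Return the fraction of successful requests, or None if nothing was logged."""
    total = len(outcomes)
    if total == 0:
        return None
    ok = sum(1 for code in outcomes if is_success(code))
    return ok / total


# Hypothetical request log: five successes, two error statuses, one connection error.
request_log = [200, 200, 502, None, 200, 429, 200, 200]
ratio = success_ratio(request_log)
print(ratio)
```

Note that under this rule other 2xx responses and redirects would also count as failures, since only "200 OK" is treated as success.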
The status describes the current state of homeserver discovery. While the "Public" and "Private" statuses are clear enough, the others may be misleading. The "Confirmed" status most likely means that the server is public but registration is protected with a captcha. The "Unknown" status most likely means that registration is open, but the bot can't register because it doesn't support the registration flow the server requires.
We currently support only the m.login.dummy flow for automated registrations. For other flows, manual registration is needed in order to pass captcha or email validation, which may take some time. [We are also planning to add some sort of contact form that can be used to request manual registration, but it's not ready yet. This section will be expanded later.]
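To make the flow restriction more concrete, the sketch below checks whether a server advertises a registration flow consisting solely of the m.login.dummy stage and builds the corresponding request body. The function names are hypothetical and the bot's real logic may differ. The `flows`/`stages` structure and the `auth` object follow the Matrix client-server API's user-interactive authentication; the sample flow lists are made up.

```python
DUMMY = "m.login.dummy"


def can_register_automatically(flows):
    """True if at least one advertised flow needs nothing but m.login.dummy."""
    return any(flow.get("stages") == [DUMMY] for flow in flows)


def build_register_body(username, password):
    """Body for a registration request that completes the dummy stage."""
    return {
        "username": username,
        "password": password,
        "auth": {"type": DUMMY},
    }


# Hypothetical flow lists as a server might advertise them.
open_flows = [{"stages": ["m.login.dummy"]}]
captcha_flows = [{"stages": ["m.login.recaptcha", "m.login.dummy"]}]

print(can_register_automatically(open_flows))
print(can_register_automatically(captcha_flows))
```

A server that only offers the captcha flow would fall into the manual registration case described above.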