Capturing useful metadata for Compute services: looking for the community's thoughts on opt-in vs. opt-out

jaskiratr · October 5, 2023, 5:34pm

Hello,
We are exploring ways to capture useful metadata for Compute services, so we can better understand how your Wasm service is built and produced.

This will help with:

Debugging issues, such as when you are using old versions of the SDKs or CLI.
Understanding which platforms you are building your services on, so we can prioritize features for those first.
Learning about adoption of the tooling and packages we provide, so we can fine-tune the release cadence.

The metadata will include:

Build Information: Information regarding the time taken for builds and compilation processes, helping us identify bottlenecks and optimize performance.
Machine Information: Only general, non-identifying system specifications (CPU, RAM, operating system) to better understand the hardware landscape our CLI operates in.
Packages Used in Source: Packages utilized in your source code, enabling us to prioritize support for the most commonly used components.

You will also be able to selectively opt in to or out of each of the above categories.

What we’d like to hear from the community is your thoughts on whether data collection (in accordance with our data management and privacy policies) from the CLI should be opt-in or opt-out by default.

Here is an example of what metadata will be collected

What do you think?

jaskiratr · October 5, 2023, 5:45pm

I personally think that opting users in to metadata collection by default is a safe choice, since the metadata doesn’t contain any sensitive information.

On the other hand, asking users to opt in is less likely to garner participation from less technical users, so a large number of users are likely to be excluded from the statistics. This is one of the biggest reasons that voluntary participation doesn’t work well in practice, as explained by @theevilskeleton in this post. That post talks about telemetry, whereas we are focusing only on one-time metadata collection during the build step, not continuous telemetry collection.

jaskiratr · October 17, 2023, 11:03pm

Responding to the thread https://mastodon.social/@devs@fastly.social/111184065168235367

I can see why opting users in by default was not a great idea for Go. However, I will point out two key differences between Go telemetry collection and Fastly Compute metadata.

1. Information type: Understandably, telemetry collection is scary, as in the case of Go. But the Fastly CLI will only collect metadata about the service that is to be deployed, just once, during the build step. The same information can be presented back to customers in the UI and CLI, which gives developers confidence in what is deployed. With telemetry, by contrast, there is no value in exposing that information back to the user.

2. Target users: Go is targeting millions of developers across the whole Go community, whereas Fastly is handling its own customers’ data. These are customers who already provide their source code to Fastly in order to deploy their services across the globe. We and our customers share the goal of better visibility into what services have been deployed in production, which makes debugging easier when things go south.

I hope this provides some clarity on how the data will be used.

Integralist · October 19, 2023, 8:47am

This makes a lot of sense to me.

Can we make sure your response is posted back to the Mastodon thread so those who took the time to comment can be included (as they might not be tracking this forum)?

Thanks.

jaskiratr · October 23, 2023, 3:25pm

Yes, certainly. I replied in that Mastodon thread as well.

kphill · November 8, 2023, 4:35pm

I would like to weigh in here in favor of making code metadata collection a default behavior that customers could choose to opt out of.

My rationale is that an anonymized roll-up of those data would be very useful to Fastly’s customers. As a customer, at a basic level, I would want to know what packages are most popular among other Fastly customers. Even better, I’d like to see some dimensionality such as by industry vertical and by scale. For example, what Rust packages are Compute customers in the commerce industry with over 10k requests/sec using?

jaskiratr · June 10, 2024, 5:43pm

Update: You can now view the metadata for your Compute services in the manage.fastly.com UI. Just go to Service Configuration → Package.

jaskiratr · June 10, 2024, 6:39pm

Reminder: You can always update your preferences for what metadata is attached to your service version. Enable what you need and disable what you don’t.
