Apr 4, 2018

PyTorch Should Be Copyleft

Posted in category: robotics/AI

Neural networks have taken off since AlexNet in 2012. We don’t have to call it a software war, but there is a competition for mindshare and community contributors in neural networks.

Of course, AI needs more than a neural network library: it also needs the configuration hyperparameters, training datasets, trained models, test environments, and more.

Most people have heard of Google’s TensorFlow, which was released at the end of 2015, but there’s another active codebase called PyTorch which is easier to understand, less of a black box, and more dynamic. TensorFlow does have solutions for some of those limitations (such as TensorFlow Fold and TensorFlow Eager), but these newer capabilities make many of TensorFlow’s other features, and much of its complexity, unnecessary. Google built a high-performance system for static computation graphs before realizing that most people want dynamic graphs. Doh!
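To make the static-versus-dynamic distinction concrete, here is a toy sketch in plain Python. It is not TensorFlow or PyTorch code, and the names `Node`, `static_double_twice`, and `dynamic_double_until` are made up for illustration. The first style builds a fixed graph and runs it later; the second just executes operations as ordinary code, so the number of steps can depend on the data.

```python
# Toy illustration only: hypothetical names, not any real library's API.

class Node:
    """A deferred operation in a static graph: built first, evaluated later."""

    def __init__(self, op, *inputs):
        self.op = op
        self.inputs = inputs

    def run(self, feed):
        args = []
        for item in self.inputs:
            if isinstance(item, Node):
                args.append(item.run(feed))   # evaluate upstream node
            else:
                args.append(feed[item])       # look up a placeholder by name
        return self.op(*args)


def static_double_twice():
    # Define-then-run: the graph shape is fixed before any data is seen.
    once = Node(lambda a: a * 2, "x")
    return Node(lambda a: a * 2, once)


def dynamic_double_until(x, limit):
    # Define-by-run: plain Python control flow decides, per input,
    # how many operations actually execute.
    steps = 0
    while x < limit:
        x = x * 2
        steps += 1
    return x, steps


graph = static_double_twice()
print(graph.run({"x": 3}))          # always exactly two doublings
print(dynamic_double_until(3, 20))  # number of doublings depends on the input
```

In the static style, a data-dependent loop would need a special loop construct inside the graph; in the dynamic style it is just a `while` statement, which is roughly why people find that approach easier to read and debug.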

And how much do you trust Google, anyway?

PyTorch grew out of Torch, which was created by people from the Idiap Research Institute in Switzerland, who later went to Facebook and Google. Doh!

I posted a bug report on the PyTorch license, asking for a copyleft one: https://github.com/pytorch/pytorch/issues/5270

I think you should consider a copyleft license. I realize it’s a pain to change the license, but it never gets easier. I read the license and it’s mostly a disclaimer and a warning. There’s nothing in there about protecting the freedom of the users.

There are lots of projects with lax licenses that are successful, so maybe it will work out okay, but the Linux kernel took off because of the copyleft license. It nudges people to give back.

Lax licenses let companies take advantage of the individual contributors. I don’t understand how someone who believes in free software also believes letting big companies turn it back into proprietary software is fine.

I realize lawyers might like that, and proprietary software companies might want it, but this group is more than just those people. It’s great you’ve got hundreds of contributors already, but if you know the way corporations work, you should be pushing for copyleft.

My bug was closed within 8 hours with the following response from a Facebook employee:

we’ve definitely thought about this in the past. We have no plans of changing our license.

The bug was closed but I could keep commenting:

When you say “we”, are you talking about Facebook or the random smaller contributors? Given you work for a large company, I hope you realize you could be biased. At the same time, you should know the way large corporations work even better. You won’t be there forever. Copyleft is stronger protection for the software and the users, do you disagree?

When you say “thought”, have you written any of it down with a link you can post for archival purposes? That way if others come along, they’ll have a good answer. I may quote your non-defense of your lax license in my writings if you don’t mind, but I’d prefer if you gave me a bit more.

I just spent several minutes looking for a discussion of the PyTorch license, and came up with nothing except another bug report closed with a similarly short answer.

Your last dismissive answer could motivate people to create a copyleft fork!

I got one more response:

We = the authors of the project.

“thought” = this is a topic that came up in the past, we discussed it among ourselves. I don’t have it written down, we don’t plan to have it written down.

I wrote one more response:

I don’t know any of these names:
https://www.openhub.net/p/pytorch/contributors

I don’t know who the authors of this project are, how much of it comes from big companies versus academics and small contributors, how much interest there is in making a copyleft version, etc.

BTW, relicensing would get you plenty of news articles. It’s also tough because Facebook doesn’t have the same reputation as the FSF or EFF for protecting users’ freedom. The TensorFlow license is also lax, so you don’t have that competitive advantage.

To some it’s a disadvantage, but it did make a difference in the Linux scheme, and you would hope to have your work be relevant for that long, and without a bunch of proprietary re-implementations over time that are charged for. The lax license could also slow software innovation because everyone is mostly improving their secret code on top.

LibreOffice was able to convince a lot of people that a copyleft license was better than the OpenOffice scheme, but I don’t know what people here think. One interesting data point would be to find out what percent of the patches and other work are by small contributors.

Anyway, you’ve got a cool project, and I wish you the best, partially because I don’t trust Google. TensorFlow is just some sample code for others to play with while they advance the state of the art and keep 95% proprietary. It also seems they made a few mistakes in the design and will now carry baggage.

There is a deep learning software war going on. It’s kind of interesting to almost be on the side of Facebook ;-)

It’s a shame that copyleft seems to be losing mindshare. If the contributors who like copyleft lit some torches, and created a fork, or threatened to, it could get the attention of the large corporations and convince them to relicense rather than risk the inefficiencies, bad press, slower progress and loss of relevance. Forks are a bad thing, but copyleft can prevent future forks, and prevent people from taking but not giving back.

Whether a PyTorch fork makes sense depends on a number of factors. The LibreOffice fork was created because people were unhappy about how Sun and then Oracle were working with the community, etc. If the only thing wrong with PyTorch is the lax license, it might become successful without needing the copyleft nudge, but how much do you trust Facebook and Google to do the right thing long-term?

I wish PyTorch used the AGPL license. Most neural networks run on servers today; they are hardly used on the Linux desktop. Data is central to AI, and of course that can stay owned by FB and the users. The ImageNet dataset created a revolution in computer vision, so let’s not forget that open data sets can be useful.

The GPL’s copyleft requirements wouldn’t even bind Facebook, because they are triggered by distributing the software and Facebook runs the code on its own servers, but the license would make a difference in other places where PyTorch could be used. You’d think Facebook could have just agreed to use the GPL or LGPL and quietly laughed, knowing that users never run Facebook’s AI software themselves.

Few people run Linux kernels remotely so the GPL is good enough for it. Perhaps it isn’t worth making a change to the PyTorch license unless they switch to AGPL. Or maybe that’s a good opening bid for those with torches and pitchforks.

I posted a link to this on the Facebook Machine Learning group, and my post was deleted and I was banned from the group!

I posted a link in the Google Deep Learning group and got some interesting responses. One person said that copyleft is inhibiting. I replied that if keeping free software free is inhibiting, there isn’t a word to describe the inhibitions that come with proprietary software!

One of the things I notice is that even though many people understand and prefer copyleft, they often encourage a lax license because they think other people want that also. There are a lot of people pushing for lax licenses even though they actually prefer copyleft.

People inside Facebook and Google know the pressure to write proprietary code better than those outside. They should be pushing for copyleft the most! On Reddit, someone suggested the MPL license. It does seem like another reasonable compromise, similar to the LGPL.