An old friend of mine and I were brainstorming about how the guy who wrote the "I Write Like" site (iwl.me) wrote the program that does the comparison, and I realised during that brainstorming that Dmitry (the guy) has to have a decent selection of electronic copies of each author's work in order to create the database or hash table that the submitter's work is compared to.

Which means that one possible scenario -- and I'm not saying it's true, I'm saying it's possible -- is that, where the texts of a given author aren't in the public domain, Dmitry picked the authors he did, and is continuing to pick the authors he is picking, because their work is available as pirated copies on file-sharing torrents (S.W.A.G., but I bet I'm right).

That would produce one /more/ interesting thing about this little meme, above and beyond the amusing results of purposefully manipulating the input text and the entirely regrettable results of purposefully pointing out to the author an area in which he could improve his toy.

It could entirely be the case that he has a decent selection of the works of each author, acquired through legitimate channels, in plain vanilla text format, unencumbered by DRM/encryption schemes, to analyse, generate hash tables from, or build his database from. I honestly do not know one way or the other.
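
For what it's worth, here is a minimal sketch of the kind of thing I mean by building a database from an author's texts and comparing a submission against it. This is only my guess at one simple approach (word-frequency profiles compared by cosine similarity), not Dmitry's actual code; the function names and the toy corpus are made up for illustration.

```python
import math
import re
from collections import Counter


def profile(text):
    """Return relative word frequencies for a piece of text."""
    words = re.findall(r"[a-z']+", text.lower())
    total = len(words)
    if total == 0:
        return {}
    return {word: count / total for word, count in Counter(words).items()}


def cosine(p, q):
    """Cosine similarity between two word-frequency profiles."""
    dot = sum(value * q.get(word, 0.0) for word, value in p.items())
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    if norm_p == 0 or norm_q == 0:
        return 0.0
    return dot / (norm_p * norm_q)


def closest_author(sample, corpus):
    """Pick the author whose profile is most similar to the sample.

    `corpus` maps an author name to the plain text of their work(s).
    """
    sample_profile = profile(sample)
    return max(corpus, key=lambda author: cosine(sample_profile, profile(corpus[author])))


# Hypothetical toy corpus, just to show the shape of the data.
corpus = {
    "Author A": "the ship sailed the sea and the storm",
    "Author B": "magic wand spell wizard school magic",
}
match = closest_author("a wizard cast a spell with his wand", corpus)
```

The point is that even something this crude needs the full text of each author's work in a machine-readable form before it can do anything at all.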

Let's also consider that, to get a good sample of work, you need a minimum of one work from each author. I count roughly thirteen authors among the results of the "I Write Like" test whose works are not in the public domain (copyrights not expired). At an Amazon-esque price of ten US dollars per legitimate, licensed digital copy of each work, that's a minimum of a hundred and thirty US dollars to get source materials, at one work per author.

That kind of cost is not exactly burdensome, but neither is it trivial.

There is also the question of whether, if he does in fact have digital copies of these works from legitimate sources, the licenses he bought for those copies allow him to make derivative and transformative works from the texts. Fair Use could be argued; however, the iwl.me site has a link to codingrobots.com, which advertises and sells a commercial product called Blogjet. I'm not a lawyer, but I'm pretty sure that transformative and derivative works of copyrighted works are not covered by Fair Use when they're used for commercial purposes, such as promoting a business. And if J.K. Rowling has a finger on the trigger of her tort lawyer team to go after fanfic that's making a buck, then I'm pretty sure she'd also have that finger on the trigger to go after someone using her works to promote their own business.

But I'm not a lawyer AND NONE OF THIS IS LEGAL ADVICE so I could be wrong.

But I think that the implications of Mr. Dmitry's activities are going to get very much more interesting very soon.


( 2 comments — Leave a comment )
Jul. 16th, 2010 05:19 am (UTC)
Yeah, I'm very interested to know if it can be done legitimately - assuming, for instance, that you bought a paper copy of The Stand, is there anything you could do to convert that paper copy into a database which drives that website without infringing the copyright? And I honestly don't know the answer.

But, like you, I strongly suspect that something brazenly illegitimate has occurred, rendering that question mostly moot.
Jul. 20th, 2010 11:56 am (UTC)
Or, as seems to actually be the case, the correlation with any given writer's style is damn near random - making it most likely the result of simply hashing the input so the same text will produce the same results each time.

Given that the site IS A SCAM devoted to getting you to repost the results, which contain a link to a Pay-To-Publish scammer, thus raising his Google rank? Yeah.
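
To show what I mean by "simply hashing the input", here is a minimal sketch, assuming nothing about the site's real code: hash the text, then use the hash to pick a name from a list. The author list and the function name are placeholders I made up.

```python
import hashlib

# Placeholder list; the real site's list of authors is not known to me.
AUTHORS = ["Stephen King", "J.K. Rowling", "Author C"]


def fake_match(text):
    """Map a text to an author using only a hash of the text.

    The same input always gives the same author, but the choice has
    nothing to do with writing style.
    """
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(AUTHORS)
    return AUTHORS[index]


first = fake_match("It was a dark and stormy night.")
second = fake_match("It was a dark and stormy night.")
```

Something like this would look consistent to a casual user while needing no copies of anybody's books at all.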
