Archive for the ‘MDN’ Category

Reopening Accounts on MDN

May 22, 2016

MDN has paid writing staff, but is also supported by community contributions. Much like Wikipedia, anyone can see a problem, create an account, and fix it. Some of the most valuable community contributions are translating content into non-English languages, which would be very expensive to do with paid staff.

MDN assumes users want to be helpful, and publishes changes immediately. This openness leaves MDN vulnerable to spam. Over the years, new features have been added to monitor changes, revert unwanted ones, and ban users with ill intent. These small mitigations were enough for the small volume of spam, about 1% of all edits.

On February 12th, a new phase in the spam fight started. Over 50% of the new accounts created that day were used to create spam, and spam was being created faster than staff could handle it. We responded by shutting down new accounts, which stopped the problem, but also shut the door on legitimate new contributors for almost three months.

There’s a few things we tried but didn’t work:

  • Adding reCAPTCHA to account creation. The spammers were able to bypass it and create as much spam as before. This suggests humans are in the spam creation loop.
  • Turning on our first Akismet integration. We had developed custom code to send edits to Akismet, but hadn’t had a chance to train it on our content before the attack. Too much spam was published, and too many legitimate edits were blocked.

After some further mitigations, we turned on account creation on April 26th, and have been able to manage the amount of new spam. Here’s what did work:

  • Analyzing the attack. With some scripting, I was able to gather data about the attack, make some calculations, and create some graphs. We were able to reject some planned mitigations, such as delaying the time to the first edit (it turns out legit editors are just as fast as spammers).  The time we saved not implementing bad fixes meant more time for the fixes that worked.
  • New users can no longer create pages by default. During the spam attack, over 90% of new pages were spam. Removing this ability has cut spam back down to manageable levels. The page creation privilege can be added when requested, but the spammers don’t bother.
  • Content training is improved. At first, we trained Akismet on historical spam and historical “good” edits. We rolled out training on live edits, and then turned on edit blocking, with further training for incorrectly blocked edits.
  • Removing spam magnets from profiles. Some fields on the user profiles are used more by spammers than legitimate users, and have been removed.

Spam is back down to less than 2%. There’s still more to do. I expect that the spammers are temporarily blocked, and will return once they have a new strategy. I want to be ready.

It’s good to have account creation open. Many new pages have been translated, many small typos fixed. We’re starting to see ambitious new changes as well. It also has increased MDN staff workload, monitoring all the changes, finding bugs, and keeping a busy site working.  The increased workload broke my one-a-week blog habit.  Still, better than one post every two years.

 

 

 

Advertisement

This is not XP

April 25, 2016

Today is day 1 of iteration 1 of the new MDN durable team. There’s a lot of new vocabulary, including the agile vocabulary that means something different on every team. We did prep work last week. Some of the team took classes in user stories and formulating tasks. We went over some of the tasks as a team, and broke up into smaller groups for the other tasks. This gives us a backlog for our first 3 week iteration.

My experience is on co-located development teams, using Extreme Programming (XP). In the past, I’ve used Shore and Warden’s The Art of Agile Development, which describes the authors’ 37 practices modeled after XP and Scrum. It was very useful when I first led an agile team, and I’ve used it in agile teams since.

Chapter 4, “Adopting XP”, includes some prerequisites for using XP. Here’s how the MDN durable team measures up:

  1. Management Support – The marketing organization is mandating the agile process. We have to work with other Mozilla organizations who don’t.
  2. Team Agreement – The team attitudes range from acceptance to hostility.
  3. A Colocated Team – No two people share an office. The team members are all remote and cover US and European time zones.
  4. On-Site Customers – We have a product manager, who is new to MDN this year. MDN has many customers, from internal Mozilla teams that use MDN to document their interfaces to code schools that use MDN as a reference.
  5. The Right Team Size – There are 10 team members: 2 developers, 5 content creators, and 3 managers (Product, Project, and Program). We’re augmenting this with contractors and volunteers.
  6. Use all the Practices – It’s not being run as an XP project, and we couldn’t if we wanted to.

I’d rate us around 50% for the prerequisites. We can’t overcome some of these without big changes in the team.

The chapter also includes some recommendations:

  1. A Brand-New Codebase – Kuma is 6 years old, and forked from another project. Some of the content goes back to 2005.
  2. Strong Design Skills – I’m good when I start from scratch, like BrowserCompat. An existing code base takes a different set of skills, and I have some experience here too. The legacy content is a different problem, both working with it and convincing others to prioritize tackling it. I’m also not in charge, so my skills are often irrelevant.
  3. A Language That’s Easy to Refactor – Python wins. KumaScript loses, but maybe is fixable.
  4. An Experienced Programmer-Coach – I think everyone is new to the Marketing org’s process. We have lots of agile experience as individuals, but this seems like this is neutral at best at working with the new process.
  5. A Friendly and Cohesive Team – Most of the team members still think of themselves as either on the content team or the development team.

I rate us below 50% on being able to follow the recommendations.

I wouldn’t recommend this as an XP project. But it’s not. It’s adopting this other process, new to me. I’m realizing how irrelevant my experience is, other than learned trust that we’ll adjust as we go. There’s time for 11 iterations this year, 11 chances to adjust.

It will be an interesting 3 weeks, and quite a year.

Starting a new phase at MDN

April 18, 2016

At the start of 2016, I was the API designer and back end developer for the BrowserCompat  project, with the goal of serving Mozilla Developer Network’s (MDN’s) compatibility data from an API.  I had been doing this for a while, one year as a part-time contractor, and six months as a Mozilla employee. Technically, I was an MDN web developer, but I was happy to let the other four developers do the bulk of the MDN work while I focused on implementing BrowserCompat.

It’s four months later, and everything has changed.

MDN development has moved from the Firefox organization to the Marketing organization.  Marketing developers work on important codebases, such as bedrock, which runs www.mozilla.org, and snippets, which has to quickly serve content for the Firefox Start Page.  MDN will benefit from their expertise in moving these services to modern cloud infrastructures.

While MDN development moved, the three other back end developers stayed in the Firefox organization. This makes me the sole back end developer on MDN. I had some light experience with Kuma, the MDN engine, mostly reviewing other’s pull requests. I’ve dived into the deep end of Kuma development for about a month.

Moving MDN to Marketing puts it in the same organization as the writers, allowing the formation of the MDN durable team.  I’m not 100% sure how “agile marketing practice” works (I haven’t read the book yet). We’ve gained some new management, and will start learning the new processes in earnest this Week. It will be interesting to see what MDN goals looks like, as opposed to separate content and development goals.

Our first goal was decided for us. In February, MDN was attacked by automated spammers, creating hundreds of pages advertising garbage that has nothing to do with the open web.  Stopping the spam required halting new account creation. I’ve been working on spam mitigations since March, and we hope to open account registration again soon.

This has been a hard transition. I really liked my job in January, and was very comfortable with the next steps to success. This new job is a lot more confusing, and changes every week. It is also a lot more important to Mozilla than my little experiment ever was. I’m sure it will take a few months for me to get comfortable with the code, the infrastructure, the new processes, and the new co-workers.  There’s no time for a gentle transition, since the site is under attack, and everyone is trying to figure out how MDN and its millions of visitors fit in the new organization.

I still get to develop open-source software every week, at a company dedicated to openness.  Being open means showing off stuff before it is 100% ready. That includes me. I hope blogging about this next phase of MDN will help me figure it out for myself.