
hardware-requests (Component)
Archived, Public

Members (2)

Watchers (1)

Details

Description

Information on how to file requests for hardware can be found here. The information below is a terse version of the linked information.

When a developer or engineer requires bare metal servers to be allocated to a project or cluster, the request for said hardware goes in this project.

If your service can be tested in Wikimedia Labs but has not been, the request will likely be denied until it has been tested there. Please include any Labs testing information relevant to your service or hardware request. If your project cannot be tested in Labs, please explain why. All projects, regardless of testing status, should have basic puppetization of the service, users, and access in place before requesting a server for production use.

Please include the following in your request:

  • Site: Where does this hardware need to run? Typically this is codfw or eqiad if it's not for a caching center.
  • Server Type: Traditional (99.999% of all requests). If this doesn't require a normal server but some special appliance, please note it.
  • Server Specifications: What are your requirements for CPU/RAM/storage? If you have a ratio for your CPU core and memory allocation, please note it.
  • Internal or External IP: Does this service need to present to the web?
  • SSL Requirements: If it needs HTTPS, can it go behind our misc-web varnish servers or does it need to terminate its own SSL?
  • Projected Duration of need: Is this for a 3-month testing project or a permanent service?
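
As an illustration only, the checklist above can be treated as a small record that is checked for completeness before a ticket is filed. The sketch below is not part of any Wikimedia or Phabricator tooling. The class name, attribute names, and example values are hypothetical, and only the labels come from the list above.

```python
from dataclasses import dataclass

# Labels mirror the checklist above; the attribute names are hypothetical.
FIELD_LABELS = {
    "site": "Site",
    "server_type": "Server Type",
    "specifications": "Server Specifications",
    "ip_exposure": "Internal or External IP",
    "ssl_requirements": "SSL Requirements",
    "duration": "Projected Duration of need",
}


@dataclass
class HardwareRequest:
    """One hardware request, with one attribute per checklist item."""

    site: str
    server_type: str
    specifications: str
    ip_exposure: str
    ssl_requirements: str
    duration: str

    def missing_fields(self) -> list[str]:
        """Return the checklist labels whose values are empty or whitespace."""
        return [
            label
            for name, label in FIELD_LABELS.items()
            if not getattr(self, name).strip()
        ]

    def render(self) -> str:
        """Format the request as one 'Label: value' line per checklist item."""
        return "\n".join(
            f"{label}: {getattr(self, name).strip()}"
            for name, label in FIELD_LABELS.items()
        )


if __name__ == "__main__":
    # Example values are made up for illustration.
    request = HardwareRequest(
        site="eqiad",
        server_type="Traditional",
        specifications="",
        ip_exposure="Internal",
        ssl_requirements="None",
        duration="Permanent service",
    )
    print(request.missing_fields())
    print(request.render())
```

Running the example reports "Server Specifications" as missing, which is the kind of gap worth filling in before the request is submitted.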

A prepopulated ticket can be created via the following link: Hardware request

Recent Activity

Jul 29 2024

Krinkle set the color for hardware-requests to Red.
Jul 29 2024, 1:27 PM

Nov 20 2023

Maintenance_bot removed a project from T159207: Create one oresrdb VM in codfw: Patch-For-Review.
Nov 20 2023, 12:33 PM · hardware-requests, SRE, Machine-Learning-Team
isarantopoulos moved T159207: Create one oresrdb VM in codfw from Unsorted to 2023-2024 Q3 Done on the Machine-Learning-Team board.
Nov 20 2023, 12:19 PM · hardware-requests, SRE, Machine-Learning-Team

Aug 31 2022

Maintenance_bot removed a project from T184551: EQIAD: (1) hardware request for eventlog1001 replacement - eventlog1002.: Patch-For-Review.
Aug 31 2022, 11:30 PM · Analytics-Radar, hardware-requests, SRE
gerritbot added a comment to T184551: EQIAD: (1) hardware request for eventlog1001 replacement - eventlog1002..

Change 822666 merged by jenkins-bot:

[operations/mediawiki-config@master] Remove reference to unreachable eventlogging-processor service

https://gerrit.wikimedia.org/r/822666

Aug 31 2022, 11:15 PM · Analytics-Radar, hardware-requests, SRE

Aug 12 2022

gerritbot added a project to T184551: EQIAD: (1) hardware request for eventlog1001 replacement - eventlog1002.: Patch-For-Review.
Aug 12 2022, 5:11 PM · Analytics-Radar, hardware-requests, SRE
gerritbot added a comment to T184551: EQIAD: (1) hardware request for eventlog1001 replacement - eventlog1002..

Change 822666 had a related patch set uploaded (by Krinkle; author: Krinkle):

[operations/mediawiki-config@master] Remove reference to unreachable eventlogging-procesor service

https://gerrit.wikimedia.org/r/822666

Aug 12 2022, 5:11 PM · Analytics-Radar, hardware-requests, SRE

Apr 8 2021

MusikAnimal closed T142748: Create proof of concept for cross-wiki watchlist database structure on Beta Cluster, a subtask of T142538: Acquire new hardware for hosting cross-wiki watchlist database, as Declined.
Apr 8 2021, 3:08 PM · Crosswiki, MediaWiki-CrossWikiWatchlist, SRE, Community-Tech, hardware-requests

Jan 12 2021

gerritbot added a comment to T187466: Decommission mw1259-mw1260.

Change 655735 merged by Dzahn:
[operations/puppet@production] DHCP: remove mw1259, mw1260

Jan 12 2021, 7:21 PM · Patch-For-Review, hardware-requests, SRE, ops-eqiad
gerritbot added a comment to T187466: Decommission mw1259-mw1260.

Change 655735 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] DHCP: remove mw1259, mw1260

Jan 12 2021, 7:08 PM · Patch-For-Review, hardware-requests, SRE, ops-eqiad
Dzahn added a comment to T187466: Decommission mw1259-mw1260.

hosts were never removed from DHCP.. cleaning up.

Jan 12 2021, 7:02 PM · Patch-For-Review, hardware-requests, SRE, ops-eqiad

Nov 5 2020

Cmjohnson closed T236327: replace onboard NIC in kafka-jumbo100[1-6], a subtask of T220700: Upgrade kafka-jumbo100[1-6] to 10G NICs (if possible), as Resolved.
Nov 5 2020, 8:21 PM · Analytics-Radar, ops-eqiad, hardware-requests, SRE, User-Elukey

Jun 10 2020

Maintenance_bot removed a project from T182955: Decommission kafka1018: Patch-For-Review.
Jun 10 2020, 7:16 AM · Analytics-Radar, hardware-requests, SRE, ops-eqiad

May 14 2020

gerritbot added a comment to T146455: Decommission labsdb1002.

Change 548257 abandoned by Arturo Borrero Gonzalez:
wmnet: cleanup unused labsdb1002 entries

May 14 2020, 9:51 AM · hardware-requests, Patch-For-Review, ops-eqiad, SRE

May 13 2020

gerritbot added a comment to T146455: Decommission labsdb1002.

Change 596271 merged by Cmjohnson:
[operations/dns@master] Removing and old dns entry for decom host labsdb1002

May 13 2020, 7:13 PM · hardware-requests, Patch-For-Review, ops-eqiad, SRE
gerritbot added a comment to T146455: Decommission labsdb1002.

Change 596271 had a related patch set uploaded (by Cmjohnson; owner: Cmjohnson):
[operations/dns@master] Removing and old dns entry for decom host labsdb1002

May 13 2020, 7:10 PM · hardware-requests, Patch-For-Review, ops-eqiad, SRE
Cmjohnson closed T146455: Decommission labsdb1002 as Resolved.
May 13 2020, 6:59 PM · hardware-requests, Patch-For-Review, ops-eqiad, SRE
Cmjohnson added a comment to T146455: Decommission labsdb1002.

Checked everything and all entries removed, server is off rack, removed the storage array attached to it and set to offline and piled for sale.

May 13 2020, 6:58 PM · hardware-requests, Patch-For-Review, ops-eqiad, SRE

May 11 2020

Cmjohnson closed T120856: Remove all out of warranty unused cp10xx's from A2, a subtask of T120679: eqiad out of warranty spares to decommission - approval request, as Resolved.
May 11 2020, 8:10 PM · hardware-requests, SRE

Apr 27 2020

Papaul updated the task description for T146455: Decommission labsdb1002.
Apr 27 2020, 8:22 PM · hardware-requests, Patch-For-Review, ops-eqiad, SRE

Apr 6 2020

ssingh closed T239250: setup/install cescout1001.eqiad.wmnet, a subtask of T238652: Hardware request for Postgres database for censorship monitoring scripts, as Resolved.
Apr 6 2020, 4:29 PM · SRE, hardware-requests

Mar 11 2020

Cmjohnson closed T245754: (Need by: TBD) setup/install sretest100[12].eqiad.wmnet, a subtask of T214024: Two test hosts for SREs, as Resolved.
Mar 11 2020, 8:13 PM · SRE, hardware-requests

Mar 3 2020

RobH archived hardware-requests.
Mar 3 2020, 8:16 PM
RobH closed T204589: eqiad: (1) misc single cpu server allocation for performance browser testing as Declined.

So this has been sitting blocked for months and months. The asset/server referenced has been used elsewhere.

Mar 3 2020, 8:15 PM · Performance-Team (Radar), SRE
RobH closed T207760: setup/install weblog1001/WMF4750 as oxygen replacement, a subtask of T181264: Refresh or replace oxygen, as Resolved.
Mar 3 2020, 6:18 PM · Analytics-Radar, hardware-requests, SRE

Feb 27 2020

Andrew closed T235685: (Need by: 2020-03-02) rack/setup/install cloudvirt-wdqs100[123].eqiad.wmnet, a subtask of T232654: eqiad: three clouvirt-wdqs servers for WDQS testing, as Resolved.
Feb 27 2020, 9:50 PM · DC-Ops, hardware-requests, SRE

Feb 25 2020

Jclark-ctr closed Unknown Object (Task), a subtask of T242885: Expand Eqiad Ganeti row_A capacity, as Resolved.
Feb 25 2020, 10:04 PM · hardware-requests, SRE

Feb 20 2020

RobH closed T214024: Two test hosts for SREs as Resolved.

These will be setup via T245754. Resolving this allocation task.

Feb 20 2020, 5:10 PM · SRE, hardware-requests
RobH added a subtask for T214024: Two test hosts for SREs: T245754: (Need by: TBD) setup/install sretest100[12].eqiad.wmnet.
Feb 20 2020, 5:08 PM · SRE, hardware-requests
faidon reassigned T214024: Two test hosts for SREs from faidon to RobH.

OK, it sounds like @akosiaris and @MoritzMuehlenhoff have coordinated with each other and they can share those two hosts as SRE test hosts.

Feb 20 2020, 5:00 PM · SRE, hardware-requests
akosiaris added a comment to T214024: Two test hosts for SREs.

FYI, I 'll also piggybacking some k8s tests on these hosts as my local env doesn't have enough memory anymore

Feb 20 2020, 4:49 PM · SRE, hardware-requests
RobH added a comment to T214024: Two test hosts for SREs.

This is still pending mgmt approval for allocation of these two spares:

Feb 20 2020, 3:31 PM · SRE, hardware-requests
MoritzMuehlenhoff added a comment to T214024: Two test hosts for SREs.

I had missed the followup. sorry. These two spare hosts would be fine as test hosts!

Feb 20 2020, 1:54 PM · SRE, hardware-requests

Feb 14 2020

wiki_willy added a comment to T146455: Decommission labsdb1002.

Chatted John a bit on this earlier, who was also talking to Rob about it last week. I think we're all good now, and this should be progressing along soon. Thanks, Willy

Feb 14 2020, 1:51 AM · hardware-requests, Patch-For-Review, ops-eqiad, SRE

Feb 13 2020

Dwisehaupt moved T176427: unrack/decom pfw1-codfw and pfw2-codfw from Triage to Done on the fundraising-tech-ops board.
Feb 13 2020, 9:46 PM · hardware-requests, ops-codfw, fundraising-tech-ops, netops, SRE
faidon updated subscribers of T146455: Decommission labsdb1002.

@Jclark-ctr @wiki_willy what's the status here? It sounds like a decom that was only partial and that only needs a few more steps to finalize perhaps?

Feb 13 2020, 2:30 PM · hardware-requests, Patch-For-Review, ops-eqiad, SRE

Feb 11 2020

RobH closed Unknown Object (Task), a subtask of T239675: Add 10G NICs to core site DNS servers (6 servers, 3 per site), as Resolved.
Feb 11 2020, 4:22 PM · hardware-requests, Traffic, SRE

Feb 6 2020

RobH closed T242885: Expand Eqiad Ganeti row_A capacity as Resolved.

memory ordered on T243442 and implementation tracking on T244530. resolving this task

Feb 6 2020, 7:52 PM · hardware-requests, SRE

Jan 22 2020

RobH moved T204589: eqiad: (1) misc single cpu server allocation for performance browser testing from Pending Approval to Stalled on the hardware-requests board.
Jan 22 2020, 7:17 PM · Performance-Team (Radar), SRE
RobH closed T232654: eqiad: three clouvirt-wdqs servers for WDQS testing as Resolved.

fulfilled by T235685, resolving task off @hw-request workboard.

Jan 22 2020, 7:16 PM · DC-Ops, hardware-requests, SRE
RobH added a subtask for T242885: Expand Eqiad Ganeti row_A capacity: Unknown Object (Task).
Jan 22 2020, 6:46 PM · hardware-requests, SRE
RobH moved T242885: Expand Eqiad Ganeti row_A capacity from Backlog to In Discussion / Review on the hardware-requests board.
Jan 22 2020, 6:39 PM · hardware-requests, SRE
RobH reassigned T214024: Two test hosts for SREs from RobH to faidon.

Ok, wmf5175 was ordered and can be allocated as the dual cpu spare pool system currently available in eqiad.

Jan 22 2020, 6:39 PM · SRE, hardware-requests

Jan 21 2020

RobH added a comment to T242885: Expand Eqiad Ganeti row_A capacity.

Ok, next steps for this as far as I can tell:

Jan 21 2020, 10:11 PM · hardware-requests, SRE
wiki_willy assigned T242885: Expand Eqiad Ganeti row_A capacity to RobH.
Jan 21 2020, 10:06 PM · hardware-requests, SRE

Jan 16 2020

thcipriani closed T239880: Replacement hardware for buster/stretch upgrade of contint1001 and contint2001 as Invalid.

So this is a LOT of hardware churn that is non-desired by DC-Ops, at least from my perspective.

If we are upgrading both contint1001 and contint2001, why can't one become active while the other is reimaged/upgraded? Otherwise we have to take these perfectly good servers (R430s) and move them to the spares pool, where they likely to never be re-used due to their relative ages compared to the rest of the spares pool (which are mostly R440s).

The end result is this, in reality, will prematurely end of life these servers currently used for contint assignment. That is non-ideal.

Can this upgrade be handled by upgrading either the codfw or eqiad first to test and then fail to it to upgrade the remainder?

Jan 16 2020, 7:52 PM · Continuous-Integration-Infrastructure (phase-out-jessie), DC-Ops, hardware-requests, SRE

Jan 15 2020

herron created T242885: Expand Eqiad Ganeti row_A capacity.
Jan 15 2020, 4:30 PM · hardware-requests, SRE

Jan 14 2020

RobH added a comment to T239880: Replacement hardware for buster/stretch upgrade of contint1001 and contint2001.
Jan 14 2020, 8:34 PM · Continuous-Integration-Infrastructure (phase-out-jessie), DC-Ops, hardware-requests, SRE
RobH added a comment to T239880: Replacement hardware for buster/stretch upgrade of contint1001 and contint2001.

@thcipriani: Please comment with additional reasoning on why we need to swap the hardware, when compared to my comment above on how this basically puts the existing hosts into EOL, and assign back to me for followup (if needed.)

Jan 14 2020, 8:33 PM · Continuous-Integration-Infrastructure (phase-out-jessie), DC-Ops, hardware-requests, SRE