Running 12x GPU on the Biostar TB250-BTC Pro with Simplemining OS

in #mining, 7 years ago

Here's a quick and easy step-by-step guide to getting Simplemining OS to run properly with 12 GPUs on the Biostar TB250-BTC Pro motherboard if you're encountering PCIe Bus Error messages.

The problem:

You've hooked up your cards, you've checked your risers, everything seems to be in order, but as soon as you boot up you're flooded with scrolling PCIe Bus Error messages that prevent your rig from even booting up (or cause it to boot extremely slowly and inhibit mining after booting). They probably look something like this -

PCIe Bus Error: Severity=Corrected, type=Physical Layer, id=00db(Receiver ID)
device [8086:aZee] error status/mask=00000001/00002000
[0] receiver error (first)

If this is what you're seeing, fret not! While I'm far from an expert on Linux kernels and PCIe bus errors, what I've come to understand about this issue is that these error messages belong to something called Advanced Error Reporting (AER), which essentially warns you when a PCIe port is behaving strangely and MIGHT fail, but hasn't yet. Whatever error was thrown was quickly resolved (that's what Severity=Corrected means), but the system wants to warn you about it. The problem with these messages is that each one causes a micro-pause during the system boot, and when several of them are thrown each second during the boot process, they effectively paralyze the boot. Obviously that's not cool when we want to be up and mining ASAP.
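If your rig does manage to boot, you can get a rough sense of how bad the flood is by counting these messages in the kernel log. This is only a minimal sketch: it assumes the messages contain the text "PCIe Bus Error" exactly as in the sample above, and the function name count_aer_errors is made up for illustration.

```shell
#!/bin/sh
# Count log lines that report a PCIe bus error.
# Reads log text on stdin, so it works with dmesg output or a saved log file.
# Assumption: the kernel prints the literal text "PCIe Bus Error" on each report.
count_aer_errors() {
    # grep -c prints 0 and exits non-zero when nothing matches; "|| true"
    # keeps the function's exit status clean in that case.
    grep -c 'PCIe Bus Error' || true
}

# Usage on a live rig (dmesg may need root on some systems):
#   dmesg | count_aer_errors
```

A count in the hundreds or thousands shortly after boot matches the flood described here; a count of 0 suggests your problem is something else.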

After communicating with the smOS dev, who was super helpful during the whole troubleshooting process, and through a few days of research and testing on our end, we eventually found the silver bullet to put an end to this issue and get our 12 card rig up and running with no issues to speak of.

And without further ado...

The Solution:

In our experience, this error was only thrown after connecting more than 2 GPUs, so first hook up 1 or 2 GPUs and boot into Simplemining OS. If you're like us, you won't get any errors, and once the OS is finished updating etc, you should begin mining normally. Now you need to get into a terminal and make a small change to your Grub config file.

  • Press CTRL + ALT + F3 to open up a new terminal window
  • Log in as root (I won't post the root password here, you can either get it by emailing [email protected] or with a little Google-Fu)
  • Enter "nano /etc/default/grub" (without the quotations, and note the leading slash)
  • Change GRUB_CMDLINE_LINUX_DEFAULT="quiet" to GRUB_CMDLINE_LINUX_DEFAULT="quiet pci=noaer" (this turns off Advanced Error Reporting)
  • Press CTRL + X to exit; nano will ask whether to save your changes
  • Press Y for Yes
  • Hit Enter to confirm the file name
  • Enter "update-grub2"
  • Enter "update-initramfs -u"
  • Shut down the rig
  • Add the rest of your cards
  • Switch the rig back on
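If you'd rather not edit the file by hand in nano, the same change can be sketched as a small shell function. Treat this as an illustration under assumptions, not the official smOS method: the function name add_kernel_param is hypothetical, it assumes the GRUB_CMDLINE_LINUX_DEFAULT line uses double quotes as shown above, and it assumes the parameter contains no "/" or "&" characters (true for pci=noaer).

```shell
#!/bin/sh
# Append a kernel parameter to GRUB_CMDLINE_LINUX_DEFAULT in a GRUB defaults file.
# Does nothing if the parameter is already on that line, so running it twice is harmless.
# Usage: add_kernel_param FILE PARAM
add_kernel_param() {
    file=$1
    param=$2
    # Already present? Leave the file untouched.
    if grep -q "^GRUB_CMDLINE_LINUX_DEFAULT=\".*${param}" "$file"; then
        return 0
    fi
    cp "$file" "$file.bak"   # keep a backup next to the original
    # First expression: insert the parameter just before the closing quote.
    # Second expression: drop the stray leading space left when the value was empty ("").
    sed -e "s/^\(GRUB_CMDLINE_LINUX_DEFAULT=\".*\)\"/\1 ${param}\"/" \
        -e 's/^\(GRUB_CMDLINE_LINUX_DEFAULT="\) /\1/' \
        "$file.bak" > "$file"
}

# Try it on a copy first and inspect the result:
#   cp /etc/default/grub /tmp/grub.test
#   add_kernel_param /tmp/grub.test pci=noaer
#   grep GRUB_CMDLINE_LINUX_DEFAULT /tmp/grub.test
```

Once the copy looks right, running it as root against /etc/default/grub replaces only the nano steps; you still need to run update-grub2 afterwards, exactly as in the list above.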

The rig should boot right up normally and begin whirring away with all 12 cards. Credit for this solution goes in large part to the Simplemining OS admin as it was his initial suggestion and instructions that got us going on the right path. Huge thanks to him for the prompt and personal support. If you have any other issues or this solution doesn't work for you, feel free to leave a comment and we'll see if we can figure out the issue together.
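If you want to confirm that the setting actually took effect after the reboot, Linux exposes the kernel command line of the running system in /proc/cmdline. Here is a minimal sketch of a check; the helper name has_kernel_param is hypothetical.

```shell
#!/bin/sh
# Succeeds if PARAM appears as a whole word in a kernel command line string.
# Usage: has_kernel_param "CMDLINE" PARAM
has_kernel_param() {
    # $1 is left unquoted on purpose so the shell splits it into words.
    for word in $1; do
        [ "$word" = "$2" ] && return 0
    done
    return 1
}

# On a running rig:
#   has_kernel_param "$(cat /proc/cmdline)" pci=noaer && echo "pci=noaer is active"
```

If the check fails, the most likely causes are a typo in the grub file or a skipped update-grub2 step.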


Great post. Can you suggest some graphics cards? I have 2 GTX 1060s right now, mining on NiceHash. I'm new to this mining thing.

I would recommend sticking with 1060s. They can be run at ~25 MH/s and are super low on power for that hash rate, so you can hook 12-13 cards on a single 1200-1600W PSU, depending on how comfortable you are pushing the envelope. Find a good quality batch of cards and ask someone to show you the ropes.

What is Simplemining OS? I can't understand it from the website.

It's a stripped-down Linux distro made specifically for easy mining rig configuration and operation. Instead of installing a full Ubuntu or Windows operating system, you can install Simplemining and be ready to mine much more quickly.

Are they charging for it?

When you sign up you have a few dollars in free credits. I forget how much exactly but enough to try it for a few days. Once you've used that up it costs $2/month/rig

For me, adding "update-initramfs -u" screws up the system and I can't boot because it messes up the boot sequence somehow.
Anyway, thank you for your post, but your solution doesn't work for me :) I'll keep trying and let you know if I manage to find my bug

Wow thank you for posting this!! Was getting so frustrated with this board. Got all 12 cards running now. Not getting any PCIe Bus errors anymore!
Blessings on your life for posting this, seriously!

Haha! You're welcome, I'm glad I was able to help out and save someone else from the headache I went through getting this fixed.

Hello maebog, can you please help me? I need the password too. Sadly I haven't had an answer from the admin. I have 5 Biostar boards to set up.

root password: miner1324

Thanks for the reply. Would you mind helping me with how or what to type in?
I'm still getting this error:
PCIe Bus Error: Severity=Corrected, type=Physical Layer, id=00db(Receiver ID)
device [8086:aZee] error status/mask=00000001/00002000
[0] receiver error (first)

Hey man, sorry for the late reply, I just now noticed there were new comments on this. First please double check that you followed all of the instructions exactly. If you're positive that you didn't have any typos etc, then you can try replacing pci=noaer with pci=nomsi and run through the rest of the steps as I listed them. See if that works. If you're still having trouble get back to me and I will keep a closer eye on this post so I can get back to you in a timely manner.
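For anyone who wants to make that swap without retyping the whole line, here is a rough sketch. The function name swap_kernel_param is hypothetical, and it assumes neither parameter contains "/" or other characters that are special to sed.

```shell
#!/bin/sh
# Replace one kernel parameter with another, on the GRUB_CMDLINE_LINUX_DEFAULT line only.
# Usage: swap_kernel_param FILE OLD NEW
swap_kernel_param() {
    cp "$1" "$1.bak"   # keep a backup next to the original
    # The address /^GRUB_CMDLINE_LINUX_DEFAULT=/ limits the substitution to that
    # line, so comments or other settings mentioning OLD are left alone.
    sed "/^GRUB_CMDLINE_LINUX_DEFAULT=/s/$2/$3/" "$1.bak" > "$1"
}

# Example (as root), followed by the same update-grub2 step as before:
#   swap_kernel_param /etc/default/grub pci=noaer pci=nomsi
```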

Thank you for your reply. Sadly I can't make it work. When I try to close and save, it says error writing file or directory doesn't exist. I followed all the instructions above.

Please enlighten me, did I type this correctly?

nano etc/default/grub
(there's a message on the toolbar saying [directory 'etc/default' does not exist])
Change GRUB_CMDLINE_LINUX_DEFAULT="quiet" to GRUB_CMDLINE_LINUX_DEFAULT="quiet pci=noaer"
After pressing CTRL+X it says error writing file or directory doesn't exist.

I'm not sure what's up with that. Try emailing the admin, he'll have a much better idea what's going on than I do. It sounds like something got fucked up with the file structure but I'm not knowledgeable enough when it comes to Linux kernels to give you specific advice.

Seems like a permissions problem. I did it using miner/miner (user/pass). miner is an admin user, so you can use it and do it as follows: "sudo nano /etc/default/grub" (personally I use vi/vim), then update the line as this article suggests (thanks btw for this post).
The latest version does not have "quiet", so adding "pci=noaer" between the quotes is enough.
Best of luck and thanks.

There is a typo:

"nano etc/default/grub"
try this
"nano /etc/default/grub"

Should be able to edit it....

Hello Friend.
I am trying to solve this problem with my mining rig, but I have a question. Once I log in as root with the password (Simplemining sent it to my email), I do not know where to make the change that you specify, I mean:
Change GRUB_CMDLINE_LINUX_DEFAULT="quiet" to GRUB_CMDLINE_LINUX_DEFAULT="quiet pci=noaer" (this turns off Advanced Error Reporting)

I can't get that part done. Could you help me?

When you enter nano /etc/default/grub (with the leading slash), your boot settings are opened in the text editor nano. You make the change to the line I specified.

When I press the keys, I get a black screen. I don't see any lines of text where I could make the change that you describe.

Thank you for this!

I will try it on my H110 because I am having an issue with hangs and losing network connectivity when I try 11+ GPUs. It's related to Ubuntu 16.04 and Skylake CPUs.

btw there's a typo: "nano etc/default/grub" should be "nano /etc/default/grub"

Hey, I'm dealing with this problem but I couldn't get root access. In the terminal window it says simpleminer login:
Now I type "root" here, right? Then it asks for the password but I can't type anything here. Only the Enter key reacts. Can you help me please?

This is just awesome, you're a savior. Just did this on my H110 Pro BTC+

I've spent days trying to figure out what is causing my rigs to hang after the low resolution text and errors. Applied this and haven't encountered that issue since!

I need your wallet address man you deserve a tip!!

I really need some help!!!!
It won't even recognize the usb as a bootable usb (won't show up when I choose boot devices). Any advice???
THANKS!
