brick recovery question #4349

Open
busishe opened this issue May 6, 2024 · 16 comments
@busishe

busishe commented May 6, 2024

I created a replica volume with three copies (replica 3), and due to various reasons two of the bricks went offline. I tried restarting the Gluster services, but the bricks did not come back online automatically. How should I recover in this situation?

@anon314159

anon314159 commented May 8, 2024

What kind of troubleshooting steps have been taken? Generally speaking, GlusterFS consists of two core services: glusterd and glusterfsd. Are both of these services started and operating normally on all of the servers in your trusted pool?

Useful information:
https://docs.gluster.org/en/latest/Troubleshooting/

# Verify service status:
systemctl status glusterd glusterfsd

# Check peer status:
gluster peer status

# Check volume status:
gluster volume status

# Volume heal state (VOLNAME is your volume's name):
gluster volume heal VOLNAME info

More information is needed in order to further troubleshoot the issues you are experiencing with GlusterFS.
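
If you already know which volume is affected, a rough per-volume check might look like the following; VOLNAME is a placeholder for your volume's name:

# Heal status and any split-brain entries for one volume:
gluster volume heal VOLNAME info
gluster volume heal VOLNAME info split-brain

# Detailed brick status (capacity, inodes, device) for that volume:
gluster volume status VOLNAME detail

# Brick logs usually live under /var/log/glusterfs/bricks/ on each server.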

@aravindavk
Member

Please share the brick logs from the failed bricks, so that we can try to understand the reasons for failures.

@busishe
Author

busishe commented May 9, 2024

Thanks for the reply.
There are about 40 volumes in the cluster running well.
The bad volume is a 3-replica volume that has been running for over 3 years.
The GlusterFS version is 3.12.6.
The current status is below:
Status of volume: vol_f2bd9512ed94a21b3033171f6ddc60f9

Gluster process                                                                                                             TCP Port  RDMA Port  Online  Pid
Brick 85.1.131.198:/var/lib/heketi/mounts/vg_a01f99384b3616a47d49d8eccfd4ba1f/brick_66fb871c367e1afb9105bc9bafdd176c/brick  N/A       N/A        N       N/A
Brick 85.1.131.196:/var/lib/heketi/mounts/vg_4aaadded4740f9da3c3102d28fa2451f/brick_8cdd41abde963477560620b45ef46fbf/brick  N/A       N/A        N       N/A
Brick 85.1.131.197:/var/lib/heketi/mounts/vg_376a29041f4efcf9bff4a703e53fba73/brick_54c006a6a0ddac821aab5f0041509d2e/brick  49166     0          Y       13439
Self-heal Daemon on localhost                                                                                               N/A       N/A        Y       15518
Self-heal Daemon on 85.1.131.198                                                                                            N/A       N/A        Y       14691
Self-heal Daemon on 85.1.131.197                                                                                            N/A       N/A        Y       45830

Task Status of Volume vol_f2bd9512ed94a21b3033171f6ddc60f9
There are no active volume tasks

I checked the brick log on the brick on 85.1.131.196 and found that the brick shut down on Apr 28.
It seems the brick was running well until the error logs started on 2024-04-28. We mount the volume into a k8s pod, and the pod writes files into it.
I checked the directory /var/lib/heketi/mounts/vg_4aaadded4740f9da3c3102d28fa2451f/brick_8cdd41abde963477560620b45ef46fbf/brick on 85.1.131.196 and it does not exist. I am confused about why it was deleted: the Java app running in the k8s pod cannot delete the brick directory, it only knows the mount point directory inside the pod.

[2024-04-26 10:57:15.069705] I [MSGID: 115029] [server-handshake.c:793:server_setvolume] 0-vol_f2bd9512ed94a21b3033171f6ddc60f9-server: accepted client from pbjfhPaaSnfs001-8610-2024/04/26-10:57:13:717856-vol_f2bd9512ed94a21b3033171f6ddc60f9-client-1-0-0 (version: 3.12.6)

[2024-04-28 08:21:51.685880] E [MSGID: 113018] [posix.c:328:posix_lookup] 0-vol_f2bd9512ed94a21b3033171f6ddc60f9-posix: post-operation lstat on parent /var/lib/heketi/mounts/vg_4aaadded4740f9da3c3102d28fa2451f/brick_8cdd41abde963477560620b45ef46fbf/brick failed [No such file or directory]

[2024-04-28 08:21:51.691384] E [MSGID: 113018] [posix.c:1391:posix_mknod] 0-vol_f2bd9512ed94a21b3033171f6ddc60f9-posix: pre-operation lstat on parent of /var/lib/heketi/mounts/vg_4aaadded4740f9da3c3102d28fa2451f/brick_8cdd41abde963477560620b45ef46fbf/brick/26-100100202344--44-1.jpg failed [No such file or directory]

[2024-04-28 08:21:52.092025] E [MSGID: 113018] [posix.c:1391:posix_mknod] 0-vol_f2bd9512ed94a21b3033171f6ddc60f9-posix: pre-operation lstat on parent of /var/lib/heketi/mounts/vg_4aaadded4740f9da3c3102d28fa2451f/brick_8cdd41abde963477560620b45ef46fbf/brick/9-108--15-1.jpg failed [No such file or directory]

[2024-04-28 08:21:52.291662] E [MSGID: 113018] [posix.c:1391:posix_mknod] 0-vol_f2bd9512ed94a21b3033171f6ddc60f9-posix: pre-operation lstat on parent of /var/lib/heketi/mounts/vg_4aaadded4740f9da3c3102d28fa2451f/brick_8cdd41abde963477560620b45ef46fbf/brick/9-108--16-1.jpg failed [No such file or directory]

[2024-04-28 08:21:52.490189] E [MSGID: 113018] [posix.c:1391:posix_mknod] 0-vol_f2bd9512ed94a21b3033171f6ddc60f9-posix: pre-operation lstat on parent of /var/lib/heketi/mounts/vg_4aaadded4740f9da3c3102d28fa2451f/brick_8cdd41abde963477560620b45ef46fbf/brick/9-108--16-2.jpg failed [No such file or directory]

The message "E [MSGID: 113018] [posix.c:328:posix_lookup] 0-vol_f2bd9512ed94a21b3033171f6ddc60f9-posix: post-operation lstat on parent /var/lib/heketi/mounts/vg_4aaadded4740f9da3c3102d28fa2451f/brick_8cdd41abde963477560620b45ef46fbf/brick failed [No such file or directory]" repeated 4 times between [2024-04-28 08:21:51.685880] and [2024-04-28 08:21:52.684661]

[2024-04-28 08:21:52.689445] E [MSGID: 113018] [posix.c:1391:posix_mknod] 0-vol_f2bd9512ed94a21b3033171f6ddc60f9-posix: pre-operation lstat on parent of /var/lib/heketi/mounts/vg_4aaadded4740f9da3c3102d28fa2451f/brick_8cdd41abde963477560620b45ef46fbf/brick/25-134--22-1.jpg failed [No such file or directory]

[2024-04-28 08:21:53.084543] E [MSGID: 113018] [posix.c:328:posix_lookup] 0-vol_f2bd9512ed94a21b3033171f6ddc60f9-posix: post-operation lstat on parent /var/lib/heketi/mounts/vg_4aaadded4740f9da3c3102d28fa2451f/brick_8cdd41abde963477560620b45ef46fbf/brick failed [No such file or directory]

[2024-04-28 08:21:53.090349] E [MSGID: 113018] [posix.c:1391:posix_mknod] 0-vol_f2bd9512ed94a21b3033171f6ddc60f9-posix: pre-operation lstat on parent of /var/lib/heketi/mounts/vg_4aaadded4740f9da3c3102d28fa2451f/brick_8cdd41abde963477560620b45ef46fbf/brick/25-143--99-1.jpg failed [No such file or directory]

[2024-04-28 08:21:53.285385] E [MSGID: 113018] [posix.c:328:posix_lookup] 0-vol_f2bd9512ed94a21b3033171f6ddc60f9-posix: post-operation lstat on parent /var/lib/heketi/mounts/vg_4aaadded4740f9da3c3102d28fa2451f/brick_8cdd41abde963477560620b45ef46fbf/brick failed [No such file or directory]

[2024-04-28 08:21:53.290540] E [MSGID: 113018] [posix.c:1391:posix_mknod] 0-vol_f2bd9512ed94a21b3033171f6ddc60f9-posix: pre-operation lstat on parent of /var/lib/heketi/mounts/vg_4aaadded4740f9da3c3102d28fa2451f/brick_8cdd41abde963477560620b45ef46fbf/brick/9-110--15-1.jpg failed [No such file or directory]

[2024-04-28 08:21:53.486033] E [MSGID: 113018] [posix.c:328:posix_lookup] 0-vol_f2bd9512ed94a21b3033171f6ddc60f9-posix: post-operation lstat on parent /var/lib/heketi/mounts/vg_4aaadded4740f9da3c3102d28fa2451f/brick_8cdd41abde963477560620b45ef46fbf/brick failed [No such file or directory]

[2024-04-28 08:21:53.490409] E [MSGID: 113018] [posix.c:1391:posix_mknod] 0-vol_f2bd9512ed94a21b3033171f6ddc60f9-posix: pre-operation lstat on parent of /var/lib/heketi/mounts/vg_4aaadded4740f9da3c3102d28fa2451f/brick_8cdd41abde963477560620b45ef46fbf/brick/9-110--16-1.jpg failed [No such file or directory]

[2024-04-28 08:21:54.084729] E [MSGID: 113018] [posix.c:328:posix_lookup] 0-vol_f2bd9512ed94a21b3033171f6ddc60f9-posix: post-operation lstat on parent /var/lib/heketi/mounts/vg_4aaadded4740f9da3c3102d28fa2451f/brick_8cdd41abde963477560620b45ef46fbf/brick failed [No such file or directory]

[2024-04-28 08:21:54.089217] E [MSGID: 113018] [posix.c:1391:posix_mknod] 0-vol_f2bd9512ed94a21b3033171f6ddc60f9-posix: pre-operation lstat on parent of /var/lib/heketi/mounts/vg_4aaadded4740f9da3c3102d28fa2451f/brick_8cdd41abde963477560620b45ef46fbf/brick/9-110--16-2.jpg failed [No such file or directory]

[2024-04-28 08:21:54.285508] E [MSGID: 113018] [posix.c:328:posix_lookup] 0-vol_f2bd9512ed94a21b3033171f6ddc60f9-posix: post-operation lstat on parent /var/lib/heketi/mounts/vg_4aaadded4740f9da3c3102d28fa2451f/brick_8cdd41abde963477560620b45ef46fbf/brick failed [No such file or directory]

[2024-04-28 08:21:54.290162] E [MSGID: 113018] [posix.c:1391:posix_mknod] 0-vol_f2bd9512ed94a21b3033171f6ddc60f9-posix: pre-operation lstat on parent of /var/lib/heketi/mounts/vg_4aaadded4740f9da3c3102d28fa2451f/brick_8cdd41abde963477560620b45ef46fbf/brick/9-110--16-3.jpg failed [No such file or directory]

[2024-04-28 08:21:54.484343] E [MSGID: 113018] [posix.c:328:posix_lookup] 0-vol_f2bd9512ed94a21b3033171f6ddc60f9-posix: post-operation lstat on parent /var/lib/heketi/mounts/vg_4aaadded4740f9da3c3102d28fa2451f/brick_8cdd41abde963477560620b45ef46fbf/brick failed [No such file or directory]

[2024-04-28 08:21:54.488961] E [MSGID: 113018] [posix.c:1391:posix_mknod] 0-vol_f2bd9512ed94a21b3033171f6ddc60f9-posix: pre-operation lstat on parent of /var/lib/heketi/mounts/vg_4aaadded4740f9da3c3102d28fa2451f/brick_8cdd41abde963477560620b45ef46fbf/brick/9-110--16-4.jpg failed [No such file or directory]

[2024-04-28 08:22:00.618079] W [MSGID: 113075] [posix-helpers.c:1851:posix_fs_health_check] 0-vol_f2bd9512ed94a21b3033171f6ddc60f9-posix: open() on /var/lib/heketi/mounts/vg_4aaadded4740f9da3c3102d28fa2451f/brick_8cdd41abde963477560620b45ef46fbf/brick/.glusterfs/health_check returned [No such file or directory]

[2024-04-28 08:22:00.618237] M [MSGID: 113075] [posix-helpers.c:1917:posix_health_check_thread_proc] 0-vol_f2bd9512ed94a21b3033171f6ddc60f9-posix: health-check failed, going down

[2024-04-28 08:22:00.618541] M [MSGID: 113075] [posix-helpers.c:1936:posix_health_check_thread_proc] 0-vol_f2bd9512ed94a21b3033171f6ddc60f9-posix: still alive! -> SIGTERM

[2024-04-28 08:22:30.619030] W [glusterfsd.c:1375:cleanup_and_exit] (-->/lib64/libpthread.so.0(+0x8744) [0x7fb424fa8744] -->/usr/sbin/glusterfsd(glusterfs_sigwaiter+0xc5) [0x4096b5] -->/usr/sbin/glusterfsd(cleanup_and_exit+0x5f) [0x4094ef] ) 0-: received signum (15), shutting down

@anon314159

Silly question, are you sure the bricks and associated subdirectories for that particular volume are mounted and accessible?
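
As a rough sketch, something like the following could confirm that on each server; the paths are placeholders for the real brick paths:

# Is the brick filesystem actually mounted?
findmnt /var/lib/heketi/mounts/vg_xxx/brick_xxx

# Does the brick subdirectory still exist and carry Gluster's volume-id xattr?
ls -ld /var/lib/heketi/mounts/vg_xxx/brick_xxx/brick
getfattr -n trusted.glusterfs.volume-id -e hex /var/lib/heketi/mounts/vg_xxx/brick_xxx/brick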

@busishe
Author

busishe commented May 9, 2024

If you don't have a helpful problem-solving approach, you don't need to reply. I am sure the volume was mounted and accessible before Apr 28. As you can see in the log, the brick ran fine before [2024-04-28 08:21:51.685880]; the offline bricks on servers 198/196 printed the error logs at the same time and then shut down. I simulated the problem in the testing environment and it can be recovered by a brick replace, and the faulty volume changed from ro mode back to rw. What I want to know is why the brick subdirectory disappeared (/data/vg_xxx/brick_xxx/brick) while the mount point (like /data/vg_xxx/brick_xxx/) still exists.

@anon314159

Interesting, because the logs reveal that glusterfsd is not able to communicate with the underlying storage associated with those bricks, which explains exactly why those bricks refuse to go into an online state. Again, that is indicative of either a double mount, a possible fsid misidentification, or something wrong with the underlying file system on those bricks. If you're going to be aggressive or posture with responses, don't expect anyone to provide help. Replacing bricks with AFR requires you to synchronize a source and destination brick, and there may not be an automatic recovery in certain situations. You didn't provide any of that information, so it's a matter of guessing at the root cause. What triggered this event? Did you attempt to replace a brick using the gluster CLI? Was there an abrupt shutdown of the system that left the underlying file system with a dirty journal (assuming xfs or zfs)? There is an awful lot of assumptions that have to be made based on the limited information you provided, such as what events precipitated the situation and what troubleshooting has been done. And if it is reproducible, how exactly did you reproduce it? None of those questions have been asked or answered.

@anon314159

anon314159 commented May 9, 2024

Assuming there's nothing wrong with the underlying bricks' storage devices or mount points, and there are no issues with split-brain, the self-heal daemon should automatically resolve this issue. Are you sure nothing tainted or corrupted the faulty bricks' file system or the directory structure associated with that particular volume? Since you have other volumes working correctly, are those using the same block devices as the faulty bricks? The error "posix: health-check failed, going down" indicates that glusterfsd shut itself down for those specific bricks because the periodic storage health check on the brick failed (i.e. possibly faulty hardware or a bug).
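
As an aside, that brick health check is tunable per volume. A minimal sketch, assuming VOLNAME is a placeholder (option availability may differ on a release as old as 3.12):

# Show the current health-check interval in seconds (0 disables the check):
gluster volume get VOLNAME storage.health-check-interval

# Example only: change the interval
gluster volume set VOLNAME storage.health-check-interval 30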

#1168

If a brick’s underlying filesystem/LVM was damaged and fsck/xfs_repair was run to recover it, some files/dirs might be missing on it. If there is a lot of missing data on the recovered bricks, it might be better to just do a replace-brick or reset-brick and let the heal fully sync everything rather than fiddling with the AFR xattrs of individual entries.
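
A minimal sketch of those two approaches, assuming VOLNAME, HOST and the brick paths are placeholders and that the backend filesystem has already been repaired or recreated:

# reset-brick: reuse the same brick path
gluster volume reset-brick VOLNAME HOST:/path/to/brick start
gluster volume reset-brick VOLNAME HOST:/path/to/brick HOST:/path/to/brick commit force

# replace-brick: move to a fresh, empty directory
gluster volume replace-brick VOLNAME HOST:/path/to/old_brick HOST:/path/to/new_brick commit force

# Then trigger and watch the heal
gluster volume heal VOLNAME full
gluster volume heal VOLNAME info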

@busishe
Author

busishe commented May 10, 2024

I said that only because you didn't give any advice or answer, just replied to me with 'silly question'. If you just want to mock rather than solve the problem, you don't need to reply to me any more. I think raising technical questions should not be ridiculed.

There is nothing wrong with the mount point and no split-brain occurred, but the brick subdirectory was removed. I think self-heal shouldn't work in this situation. I checked the whole cluster and there are 5 bricks with the same issue, and the log just says [No such file or directory]. I wonder which process could have removed these brick subdirectories. Our servers are on an isolated local network, and nobody has logged in to the servers these days.

@anon314159

If the underlying directories associated with a particular volume have been deleted, it's safe to assume you will need to remove that brick from the volume, delete the underlying directory structure, recreate it, and then add it back into the volume. Normally resetting the brick should be sufficient, but I've seen issues with that not working if you are trying to point it at the very same directory that was originally configured as a brick. The easiest solution is to remove the brick from the volume, either delete the suspect top-level directory or reset its extended attributes using setfattr, and then simply add that brick back to the volume (see the sketch below). AFR seems to run into issues if files or folders are deleted outside of the FUSE mount; it's generally not a good idea for brick contents to be removed behind Gluster's back, whether intentionally or via file system corruption. Under certain circumstances there's no other way to resolve the matter than failing the brick and re-adding it back to the volume.
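
A hedged sketch of that remove/re-add cycle for a replica 3 volume; VOLNAME, HOST and the paths are placeholders, and the replica counts must match your actual layout:

# Drop the dead brick, temporarily reducing the replica count
gluster volume remove-brick VOLNAME replica 2 HOST:/path/to/brick force

# Clean the old brick directory (if it still exists) so it can be reused
setfattr -x trusted.glusterfs.volume-id /path/to/brick
setfattr -x trusted.gfid /path/to/brick
rm -rf /path/to/brick/.glusterfs

# Add it back and let self-heal repopulate it
gluster volume add-brick VOLNAME replica 3 HOST:/path/to/brick force
gluster volume heal VOLNAME full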

@anon314159

Also, there are no silly questions in this situation. There's a default issue template designed to capture the level of troubleshooting you've done prior to submitting a request. Also, you titled this as a question when it doesn't seem like an actual question; it's more like an issue or bug. If you are genuinely interested in understanding how it works, the documentation describes the architecture along with simple troubleshooting steps. The original question was incredibly vague and non-specific, so I was forced to go the very simple route of asking what you have done to troubleshoot. Now that I know the underlying bricks no longer have data stored on them, my original conclusion was correct: either the data was deleted, it's potentially corrupt, or something went wrong when you attempted to replace/reset that brick. The documentation has very specific steps about resetting and replacing bricks; that is, you cannot reuse the same directory unless you either delete the underlying data or reset the attributes on that folder prior to using the replace-brick or reset-brick command. It's not meant to insult you or ask simplistic questions; it's a matter of establishing what steps you've taken to troubleshoot and reproduce the problem. In this situation it comes down to, again, verifying that the underlying file system and directories are not corrupt, that the block devices associated with those bricks are functioning correctly, using the correct process to reset the bricks, and then invoking a heal on the volume. I reviewed the log data you sent and determined that the glusterfsd daemon is unable to pass a health check against the folders you've tagged as bricks, which means either the directory structure is missing, the block devices are not mounted on the respective peer, or something went wrong while resetting/replacing the affected bricks. Making sure the peers are working, the volume is started, and all of the bricks are functioning correctly, and dumping volume and state information, are simple, rudimentary questions so that I can better help resolve your issue. It is in no way an attempt to condescend or patronize you.

@busishe
Author

busishe commented May 10, 2024

Well, thank you for the reply. At first I thought it was a simple problem where the brick couldn't start; by replacing the brick I had already resolved it in the testing environment. Now I hope to understand the reason why the brick cannot start, so things have become complicated. Today I checked glusterd.log and the OS message log.

glusterd.log:
[2024-04-28 08:22:30.619247] I [MSGID: 106144] [glusterd-pmap.c:396:pmap_registry_remove] 0-pmap: removing brick /var/lib/heketi/mounts/vg_4aaadded4740f9da3c3102d28fa2451f/brick_8cdd41abde963477560620b45ef46fbf/brick on port 49170
[2024-04-28 08:22:30.622639] W [socket.c:593:__socket_rwv] 0-management: readv on /var/run/gluster/833e2bfc86def78b22b83d30542015ed.socket failed (No data available)

It seems like the gluster process removed the brick without anyone operating on it.

And the OS message log shows:
2024-04-28T08:37:57.005643+08:00 pbjfhPaaSnfs001 lvm[3476]: Failed to extend thin vg_4aaadded4740f9da3c3102d28fa2451f-tp_8cdd41abde963477560620b45ef46fbf-tpool.
2024-04-28T08:37:58.004648+08:00 pbjfhPaaSnfs001 lvm[3476]: /run/lvm/lock/V_vg_4aaadded4740f9da3c3102d28fa2451f: open failed: Too many open files

I'm not sure whether the client app did something wrong on these volumes, e.g. file streams not being closed.
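
Both suspects can be checked on the affected server with something like the following (a rough sketch; <glusterfsd-pid> is a placeholder for whichever brick process serves the volume):

# Thin pool usage; Data% or Meta% near 100 means the pool could not be extended
lvs -a -o lv_name,vg_name,data_percent,metadata_percent

# Open-file pressure: system-wide usage and the brick process's own limit
cat /proc/sys/fs/file-nr
grep "open files" /proc/<glusterfsd-pid>/limits
ls /proc/<glusterfsd-pid>/fd | wc -l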

@anon314159

anon314159 commented May 10, 2024

When using distributed or replicated volumes, Gluster maps and creates the directory structure on all bricks/subvolumes within the volume. Files are synced between replicas based on size, hashes and mtime attributes. In the event one or more of the bricks within a particular volume lose access to that metadata or layout/mapping, Gluster's AFR translator will usually not heal those files automatically, and glusterfsd may crash or fail to start against the affected bricks. In general this requires administrative action to identify and fix the underlying root cause: verifying that the file system hosting the bricks is healthy and accessible, that all of the files and folders are synced (for replicated volumes), and that there is no split-brain between the files. What you are alluding to is an issue with the behavior of AFR and the self-heal process. I suspect events external to Gluster trashed the file/folder structure of several bricks and glusterfsd refuses to bind to those bricks. I believe this is normal behavior, intended to prevent AFR from replicating bad or garbage data to otherwise healthy bricks in the volume. What's missing is documentation that describes the proper process for safely resetting/replacing failed bricks on an active volume and the expected behavior during a heal.

@busishe
Author

busishe commented May 10, 2024

You make sense; are you a developer on this project? I tried mounting a new volume in my application today and then copying data from the old volume to this new volume. There are over 10,000 files in total, but they are all very small, totalling less than 2 GB. After the copy completed, I observed that one brick of the new volume had already gone offline, and the OS log showed an error that attempting to obtain the LVM lock had failed. Perhaps this ancient version (3.12) of Gluster has some performance bottlenecks for reading and writing a large number of small files. But due to OS limitations, we are unable to upgrade to a newer version.

@anon314159

Sadly, I am not a dev, and based on the version of Gluster you are running the devs are less inclined to help unless you upgrade to a supported version. Based on the kernel messages above, this seems more like a file system problem or a lack of space/inodes on one or more LVs attached to your affected volumes. Did you run out of space/extents or attempt to alter the size of the affected bricks?

@anon314159

Also, I think your system may simply be running out of file handles/descriptors. Depending on your operating system, you may need to adjust these limits via sysctl. Older versions of Gluster have a tendency to crash glusterfsd when there are too many open file handles or extremely heavy I/O. I'm currently running 10.x and occasionally run into sporadic glusterfsd crashes due to heavy I/O or system memory pressure, and it seems issues like that have been fixed in version 11.x.
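
As a rough sketch of what raising those limits can look like on a systemd-based system (the values are illustrative, not recommendations):

# System-wide ceiling on open file handles
sysctl fs.file-max
sysctl -w fs.file-max=1000000

# Per-service limit for the gluster daemons: add a drop-in containing
#   [Service]
#   LimitNOFILE=65536
systemctl edit glusterd
systemctl daemon-reload
systemctl restart glusterd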

@busishe
Author

busishe commented May 16, 2024

Thanks! I have the same suspicion and have already passed the issue to my colleague in charge of the OS for investigation. Thank you.
